Technical University of Valencia Computer Science Department SOFSEM’07 (22/01/2007) A Program Slicing Based Method to Filter XML/DTD documents

Josep F. Silva Galiana


Contents

• Motivation
• Program Slicing
• XML
    • DTD
    • XSLT
• Slicing XML Documents
    • Example
• Implementation
• Conclusions & Future Work

Program Slicing

• Definition: Program transformation to extract the program statements that (potentially) affect the values computed at some point of interest.

• Origin: Originally introduced by Weiser.

• Example:

    (1)  read(n)
    (2)  i = 1
    (3)  sum = 0
    (4)  product = 1
    (5)  while (i <= n) do
         begin
    (6)    sum = sum + i
    (7)    product = product * i
    (8)    i = i + 1
         end
    (9)  write(sum)
    (10) write(product)

Slicing Criterion = (10, product)
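To make the criterion concrete, the following is a minimal Python sketch of the example program and of its slice with respect to (10, product). Python is not used in the talk, and the function names `original` and `sliced` are illustrative assumptions. Statements (3), (6) and (9) only concern `sum`, so they cannot affect the value written at (10) and are left out of the slice.

```python
def original(n):
    """Transcription of statements (1)-(10); returns the two values written."""
    i = 1                      # (2)
    total = 0                  # (3)  called 'sum' on the slide
    product = 1                # (4)
    while i <= n:              # (5)
        total = total + i      # (6)
        product = product * i  # (7)
        i = i + 1              # (8)
    return total, product      # (9) and (10)


def sliced(n):
    """Slice w.r.t. (10, product): statements (3), (6) and (9) are removed."""
    i = 1                      # (2)
    product = 1                # (4)
    while i <= n:              # (5)
        product = product * i  # (7)
        i = i + 1              # (8)
    return product             # (10)
```

For every input, the slice computes the same `product` as the original program, which is the property a slice has to preserve.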


• Applications: Debugging, code understanding, specialization, etc.

All of these applications are based on Program Dependence Graphs (PDGs), which capture the structure and behaviour of programs.

What would happen if program slicing were applied to a data structure? Would it be interesting?


XML

XML (eXtensible Markup Language)

• Origin: XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996.

• Structure: Documents are trees composed of ‘ELEMENTS’, which contain attributes.

[Figure: example of an XML document.]

DTD (Document Type Definition)

• Objective: The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.

• Structure: Documents are graphs composed of ‘ELEMENTS’.

[Figure: example of a DTD document.]

Example document, XML (eXtensible Markup Language):

    <PersonalInfo>
      <Contact>
        <Status> Professor </Status>
        <Name> Ryan </Name>
        <Surname> Gibson </Surname>
      </Contact>
      <Teaching>
        <Subject>
          <Name> Logic </Name>
          <Sched> Mon/Wed 16-18 </Sched>
          <Course> 4-Mathematics </Course>
        </Subject>
        <Subject>
          <Name> Algebra </Name>
          <Sched> Mon/Tur 11-13 </Sched>
          <Course> 3-Mathematics </Course>
        </Subject>
        …
      </Teaching>
      <Research>
        <Project name = "SysLog" year = "2003-2004" budget = "16000€" />
      </Research>
    </PersonalInfo>

Its DTD (Document Type Definition):

    <!ELEMENT PersonalInfo (Contact, Teaching, Research)>
    <!ELEMENT Contact (Status, Name, Surname)>
    <!ELEMENT Status ANY>
    <!ELEMENT Name ANY>
    <!ELEMENT Surname ANY>
    <!ELEMENT Teaching (Subject+)>
    <!ELEMENT Subject (Name, Sched, Course)>
    <!ELEMENT Sched ANY>
    <!ELEMENT Course ANY>
    <!ELEMENT Research (Project)>
    <!ELEMENT Project ANY>
    <!ATTLIST Project
        name CDATA #REQUIRED
        year CDATA #REQUIRED
        budget CDATA #IMPLIED
    >

XSLT (eXtensible Stylesheet Language Transformations)

• Objective: XSLT is a language for transforming XML.

• Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO.

• XSLT is a programming language.

[Figure: example of an XSLT document (source code) and the page it produces (result).]

Slicing XML Documents

• We see XML documents and DTDs as trees.

[Figure: the example XML document drawn as a tree. The root PersonalInfo has three children: Contact (Professor, Ryan, Gibson), Teaching (Subject nodes: Logic, Mon/Wed 16-18, 4-Mathematics; Algebra, Mon/Tur 11-13, 3-Mathematics; …) and Research (Project node: SysLog, 2003-2004, 16000 euro; …).]
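The tree view can be tried out with a small Python sketch, given here only as an illustration. It assumes Python's standard `xml.etree.ElementTree` parser rather than the tooling used in the talk, and the helper name `tag_paths` is hypothetical. It parses the example document and lists, for each element, the path of tag names leading to it from the root.

```python
import xml.etree.ElementTree as ET

# The PersonalInfo example, without the elided "…" parts.
DOC = """<PersonalInfo>
  <Contact><Status>Professor</Status><Name>Ryan</Name><Surname>Gibson</Surname></Contact>
  <Teaching>
    <Subject><Name>Logic</Name><Sched>Mon/Wed 16-18</Sched><Course>4-Mathematics</Course></Subject>
    <Subject><Name>Algebra</Name><Sched>Mon/Tur 11-13</Sched><Course>3-Mathematics</Course></Subject>
  </Teaching>
  <Research><Project name="SysLog" year="2003-2004" budget="16000€"/></Research>
</PersonalInfo>"""


def tag_paths(elem, prefix=()):
    """Yield the root-to-node path of tag names for every element, in document order."""
    path = prefix + (elem.tag,)
    yield path
    for child in elem:
        yield from tag_paths(child, path)


root = ET.fromstring(DOC)
paths = list(tag_paths(root))
for p in paths:
    print(".".join(p))
```

Each printed line corresponds to one node of the tree, so the two Subject elements produce the same tag paths even though they are distinct nodes.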

• The slicing criterion is composed of a set of nodes in the tree.

• For each node in the slicing criterion, we extract from the tree all those nodes that are in the path from the root to the node.

[Figure: a web page (original) and its slice, with the kinds of slice: XML or DTD criterion, forward or backward.]
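The path-extraction rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard `xml.etree.ElementTree` module, the function name `backward_slice` is hypothetical, and the input is a trimmed version of the example document.

```python
import xml.etree.ElementTree as ET


def backward_slice(root, criterion):
    """Return a copy of the tree keeping only nodes on a root-to-criterion path.

    `criterion` is a set of Element objects taken from the tree rooted at `root`.
    """
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent = {child: node for node in root.iter() for child in node}
    keep = set()
    for node in criterion:
        while node is not None:        # walk up until the root has been added
            keep.add(node)
            node = parent.get(node)

    def copy(node):
        new = ET.Element(node.tag, dict(node.attrib))
        new.text = node.text
        new.extend([copy(c) for c in node if c in keep])
        return new

    return copy(root)


root = ET.fromstring(
    "<PersonalInfo>"
    "<Contact><Status>Professor</Status><Name>Ryan</Name></Contact>"
    "<Teaching><Subject><Name>Logic</Name><Sched>Mon/Wed 16-18</Sched></Subject></Teaching>"
    "</PersonalInfo>"
)
sched = root.find("./Teaching/Subject/Sched")
sliced = backward_slice(root, {sched})
print(ET.tostring(sliced, encoding="unicode"))
```

With the Sched node as criterion, only PersonalInfo, Teaching, Subject and Sched survive; Contact and the sibling Name are dropped.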

• DTD backward slicing criterion

[Figure: the example DTD and its tree (PersonalInfo with children Contact, Teaching and Research; Contact with Status, Name and Surname; Subject with Name, Sched and Course; Project with Name, Year and Budget), shown before and after slicing, together with the original web page and its slice.]

• XML backward slicing criterion

[Figure: the example XML document and its tree, shown before and after slicing, together with the original web page and its slice.]

• We distinguish between DTD and XML slicing criteria.
    • XML slicing criteria are more fine-grained than DTD slicing criteria.

• We distinguish between forward and backward slices (or a combination).

[Figure: a web page (original) and its slice, with the kinds of slice: XML or DTD criterion, forward or backward.]
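The difference between the slice directions can be illustrated with a short Python sketch. The slides do not define a forward slice in text, so the sketch ASSUMES that a forward slice keeps the criterion node together with all of its descendants, while a backward slice keeps the path from the root to the node. The function names and the trimmed document are illustrative only.

```python
import xml.etree.ElementTree as ET


def backward(root, node):
    """Nodes on the path from the root down to `node` (inclusive)."""
    parent = {child: p for p in root.iter() for child in p}
    path = []
    while node is not None:
        path.append(node)
        node = parent.get(node)
    return path[::-1]


def forward(node):
    """ASSUMPTION: a forward slice is `node` plus all of its descendants."""
    return list(node.iter())


root = ET.fromstring(
    "<PersonalInfo>"
    "<Contact><Status>Professor</Status></Contact>"
    "<Teaching>"
    "<Subject><Name>Logic</Name><Sched>Mon/Wed 16-18</Sched></Subject>"
    "<Subject><Name>Algebra</Name></Subject>"
    "</Teaching>"
    "</PersonalInfo>"
)
subject = root.find("./Teaching/Subject")   # the first Subject element
back = [n.tag for n in backward(root, subject)]
fwd = [n.tag for n in forward(subject)]
both = back + fwd[1:]                        # combination; the criterion is counted once
print(back, fwd, both)
```

Because the criterion is a single XML node, the second Subject is not part of any of the three slices.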


• XML forward slicing criterion

[Figure: the example XML document and its tree, shown before and after forward slicing, together with the original web page and its slice.]

• XML backward-forward slicing criterion

[Figure: the example XML document and its tree, shown before and after backward-forward slicing, together with the original web page and its slice.]

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa.

• We guarantee that XML slices are valid with respect to DTD slices.

[Figure: the slicer takes a DTD document, an XML document and a slicing criterion, and produces a DTD slice document and an XML slice document.]

• A simple slicing algorithm

[Figure: listing of the slicing algorithm.]

• In the case of a DTD criterion composed of a set of positions $C = \{p_1, \ldots, p_n\} \subseteq Pos(D)$, the algorithm would be the same, except that the first loop would be:

    For each $v_1.v_2.(\ldots).v_n \in C$ do
        $V' = V' \cup \{v_1,\ v_1.v_2,\ \ldots,\ v_1.v_2.(\ldots).v_n\}$
        $W' = W' \cup \{v_1|_i.v_2|_j.(\ldots).v_n|_k\}$
    where $v_1.v_2.(\ldots).v_n \in V'$ and $v_1|_i.v_2|_j.(\ldots).v_n|_k \in X$

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
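One possible reading of this loop is sketched below in Python. It is an interpretation, not the algorithm from the talk: a DTD position is ASSUMED to be representable as a tuple of element names, V' collects every prefix of each criterion path, and W' collects every XML node whose path of tag names is in V'. The function name `slice_by_dtd_criterion` and the trimmed document are hypothetical.

```python
import xml.etree.ElementTree as ET


def slice_by_dtd_criterion(root, criterion):
    """criterion: set of tag paths (tuples of element names), standing in for DTD positions.

    Returns (V, W): V is the set of DTD paths kept, W the list of XML nodes kept.
    """
    V = set()
    for path in criterion:
        # V' = V' ∪ {v1, v1.v2, ..., v1.v2.(...).vn}
        V.update(path[:k] for k in range(1, len(path) + 1))

    W = []

    def visit(node, prefix):
        path = prefix + (node.tag,)
        if path in V:                  # XML node whose tag path is a kept DTD path
            W.append(node)
            for child in node:
                visit(child, path)

    visit(root, ())
    return V, W


root = ET.fromstring(
    "<PersonalInfo>"
    "<Contact><Name>Ryan</Name></Contact>"
    "<Teaching>"
    "<Subject><Name>Logic</Name><Sched>Mon/Wed 16-18</Sched></Subject>"
    "<Subject><Name>Algebra</Name></Subject>"
    "</Teaching>"
    "</PersonalInfo>"
)
V, W = slice_by_dtd_criterion(root, {("PersonalInfo", "Teaching", "Subject", "Name")})
print([n.tag for n in W], [n.text for n in W if n.tag == "Name"])
```

The DTD criterion selects the Name of every Subject at once, whereas an XML criterion could pick a single one; the Name under Contact has a different path and is not selected.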

The following theorem states the correctness of the technique.

Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D' of D and a slice X' of X computed with an XML slicing criterion C, and given a slice D'' of D and a slice X'' of X computed with a DTD slicing criterion C', then:

a) D' is well-formed and X' is valid with respect to D'
b) D'' is well-formed and X'' is valid with respect to D''

If all the elements in C are of one of the types in C', then:

c) D' = D''
d) X' is a subtree of X''


Implementation

We have implemented a prototype in Haskell.

Haskell provides us with a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

    -- Name and Value are assumed here to be synonyms for String
    type Name      = String
    type Value     = String
    data Element   = Elem Name [Attribute] [Content]
    type Attribute = (Name, Value)   -- a pair is a type synonym, not a new data type
    data Content   = CElem Element
                   | CText String

27

XML XSLT WebPage

(Data)(Presentation)

Implementation

From XML slices to Webpage slices

XML XSLT WebPage

(Data)(Presentation)

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.


The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml


Conclusions & Future Work

Conclusions

• We proposed the application of program slicing techniques to XML data structures.

• We defined an algorithm to slice XML and DTD documents.
    • It produces XML and DTD slices that are well-formed and valid.
    • Previous slicers can be used with a modest implementation effort.

• Slicing web pages
    • The slicer can use XSLT in order to slice webpages.
    • We proposed some guidelines to generate XSLT files.

Future Work

• Migration to XML Schema
• New implementation based on XQuery

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
Page 2: Josep F. Silva Galiana

2

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Program Slicing

3

Program Slicing

bull DefinitionDefinition Program transformation to extract the program statements that (potentially) affect the values computed at some point of interest

bull Origin Origin Originally introduced by Weiser

bull ExampleExample (1) read(n) (2) i=1(3) sum=0(4) product=1(5) while (ilt=n) do

begin(6) sum=sum+i(7) product=producti(8) i=i+1

end(9) write(sum)(10) write(product)

Slicing Criterion = (10 product)

4

Program Slicing

bull DefinitionDefinition Program transformation to extract the program statements that (potentially) affect the values computed at some point of interest

bull Origin Origin Originally introduced by Weiser

bull ExampleExample (1) read(n) (2) i=1(3) sum=0(4) product=1(5) while (ilt=n) do

begin(6) sum=sum+i(7) product=producti(8) i=i+1

end(9) write(sum)(10) write(product)

Slicing Criterion = (10 product)

5

Program Slicing

bull ApplicationsApplications Debugging Code understanding Specialization etc

All the applications are based on the Program Dependence Graphs (PDGs) (structure and behaviour of programs)

What would happen if Program Slicing was applied to a data structure Would it be interesting

6

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

XML

7

XML

bull OriginOrigin XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996

bull StructureStructure Documents are trees composed by lsquoELEMENTSrsquo which contain attributes

Example of XML document

XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

8

XML

bull ObjectiveObjective The purpose of a DTD is to define the legal building blocks of an XML document It defines the document structure with a list of legal elements

bull StructureStructure Documents are graphs composed by lsquoELEMENTSrsquo

Example of DTD document

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)

9

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status Name Surname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

10

XML

bull ObjectiveObjective XSLT is a language for transforming XML

bull StructureStructure An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO

bull XSLT is a programming language

Example of XSLT document(Source Code)

XSLT XSLT (e(eXXtensible tensible SStylesheet tylesheet LLanguage anguage TTransformations)ransformations)

Example of XSLT document(Result)

11

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Slicing XML Documents

12

Slicing XML Documentsbull We see XML documents and DTDs as trees

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

bull We distinguish between DTD and XML slicing criterionsbull XML slicing criterions are more fine-grained than DTD slicing criterions

bull We distinguish between forward and backward slices (or a combination)

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

18

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

20

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML backward-forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

21

Slicing XML Documents

bull What happens with DTDs Slices are well-formed but are they valid

bull For each XML slice we produce a DTD slice and viceversa

bull We guarantee that XML slices are valid with respect to DTD slices

DTD

document

SlicerSlicer

XMLdocument

DTD Slicedocument

XML SlicedocumentSlicing Criterion

22

Slicing XML Documents

bull A simple slicing algorithm

23

Slicing XML Documents

bull In the case of a DTD criterion composed by a set of positions C = p1hellippn Pos(D) the algorithm would be the same except that the first loop would be

For each v1v2(hellip)vn C do Vrsquo = Vrsquo v1 v1v2 hellip v1v2(hellip)vn Wrsquo = Wrsquo v1|iv2|j(hellip)vn|k Where v1v2(hellip)vn vrsquo and v1|iv2|j(hellip)vn|k X

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion

24

Slicing XML Documents

The following theorem states the correctness of the technique.

Theorem. Let $D$ be a well-formed DTD and $X$ a well-formed XML document valid with respect to $D$. Given a slice $D'$ of $D$ and a slice $X'$ of $X$ computed with an XML slicing criterion $C$, and given a slice $D''$ of $D$ and a slice $X''$ of $X$ computed with a DTD slicing criterion $C'$, then:

a) $D'$ is well-formed and $X'$ is valid with respect to $D'$.
b) $D''$ is well-formed and $X''$ is valid with respect to $D''$.

If all the elements in $C$ are of one of the types in $C'$, then:

c) $D' = D''$
d) $X'$ is a subtree of $X''$

25


Implementation

26

Implementation

We have implemented a prototype in Haskell.

Haskell provides a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

    data Element   = Elem Name [Attribute] [Content]
    type Attribute = (Name, Value)
    data Content   = CElem Element
                   | CText String
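To illustrate how such a representation can be filtered, here is a minimal sketch. It uses simplified standalone copies of the types shown on the slide (not HaXml's actual definitions, which differ in detail), and `restrict` is a hypothetical helper, not a function of the prototype or of HaXml.

```haskell
-- Simplified standalone copies of the slide's types (assumption: plain strings
-- for names and attribute values).
type Name = String
type Value = String
type Attribute = (Name, Value)

data Element = Elem Name [Attribute] [Content] deriving (Eq, Show)

data Content
  = CElem Element
  | CText String
  deriving (Eq, Show)

-- Keep a child element only if its name is in the allowed list, recursively.
-- Text and attributes of the elements that survive are left untouched.
restrict :: [Name] -> Element -> Element
restrict allowed (Elem name attrs contents) = Elem name attrs (concatMap keep contents)
  where
    keep (CElem child@(Elem childName _ _))
      | childName `elem` allowed = [CElem (restrict allowed child)]
      | otherwise = []
    keep text@(CText _) = [text]
```

For example, `restrict ["Teaching", "Subject", "Name"]` applied to the PersonalInfo element drops Contact and Research and keeps only the Name of each Subject.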

27

Implementation

From XML slices to web page slices

[Diagram: XML (data) and XSLT (presentation) are combined into a web page, shown for the original document and for the slice.]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same web pages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31


Conclusions & Future Work

32

Conclusions

• We proposed the application of program slicing techniques to XML data structures.

• We defined an algorithm to slice XML and DTD documents:
  - XML and DTD slices are well-formed and valid.
  - Previous slicers can be used with a modest implementation effort.

• Slicing web pages:
  - The slicer can use XSLT in order to slice web pages.
  - We proposed some guidelines to generate XSLT files.

Future Work

• Migration to XML Schema
• New implementation based on XQuery


XML

7

XML

XML (eXtensible Markup Language)

• Origin: XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996.

• Structure: Documents are trees composed of 'ELEMENTS', which can contain attributes.

Example of XML document

8

XML

DTD (Document Type Definition)

• Objective: The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.

• Structure: Documents are graphs composed of 'ELEMENTS'.

Example of DTD document

9

XML (eXtensible Markup Language)

    <PersonalInfo>
      <Contact>
        <Status> Professor </Status>
        <Name> Ryan </Name>
        <Surname> Gibson </Surname>
      </Contact>
      <Teaching>
        <Subject>
          <Name> Logic </Name>
          <Sched> Mon/Wed 16-18 </Sched>
          <Course> 4-Mathematics </Course>
        </Subject>
        <Subject>
          <Name> Algebra </Name>
          <Sched> Mon/Tur 11-13 </Sched>
          <Course> 3-Mathematics </Course>
        </Subject>
        …
      </Teaching>
      <Research>
        <Project name = "SysLog" year = "2003-2004" budget = "16000€" />
      </Research>
    </PersonalInfo>

DTD (Document Type Definition)

    <!ELEMENT PersonalInfo (Contact, Teaching, Research)>
    <!ELEMENT Contact (Status, Name, Surname)>
    <!ELEMENT Status ANY>
    <!ELEMENT Name ANY>
    <!ELEMENT Surname ANY>
    <!ELEMENT Teaching (Subject+)>
    <!ELEMENT Subject (Name, Sched, Course)>
    <!ELEMENT Sched ANY>
    <!ELEMENT Course ANY>
    <!ELEMENT Research (Project)>
    <!ELEMENT Project ANY>
    <!ATTLIST Project
        name   CDATA #REQUIRED
        year   CDATA #REQUIRED
        budget CDATA #IMPLIED
    >

10

XML

XSLT (eXtensible Stylesheet Language Transformations)

• Objective: XSLT is a language for transforming XML.

• Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO.

• XSLT is a programming language.

[Figures: example of an XSLT document (source code) and the result it produces.]

11


Slicing XML Documents

12

Slicing XML Documents

• We see XML documents and DTDs as trees

[Figure: the example XML document next to its tree view. The root PersonalInfo has three children: Contact (Professor, Ryan, Gibson), Teaching (one Subject per course: Logic, Mon/Wed 16-18, 4-Mathematics; Algebra, Mon/Tur 11-13, 3-Mathematics; …) and Research (Project: Syslog, 2003-2004, 16000 euro).]

13

Slicing XML Documents

• The slicing criterion is composed of a set of nodes in the tree.

• For each node in the slicing criterion, we extract from the tree all the nodes that are on the path from the root to that node.

[Figure: web page (original) and web page (slice), with the slicer options XML / DTD criterion and forward / backward slicing.]
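The path extraction described above can be sketched as follows. This is a minimal illustration, not the prototype's code: `Tree`, `Pos` and `backwardSlice` are hypothetical names, and a node of the criterion is identified by its position, i.e. the list of 0-based child indices leading to it from the root.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical labelled tree standing in for an XML document or a DTD.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position is the list of child indices (0-based) from the root to a node.
type Pos = [Int]

-- Keep exactly the nodes whose position is a prefix of some criterion position,
-- that is, the nodes lying on a path from the root to a criterion node.
-- The root is always kept, so an empty criterion yields the bare root.
backwardSlice :: [Pos] -> Tree -> Tree
backwardSlice criterion = go []
  where
    go here (Node label kids) =
      Node
        label
        [ go there kid
        | (i, kid) <- zip [0 ..] kids
        , let there = here ++ [i]
        , any (there `isPrefixOf`) criterion
        ]
```

With the PersonalInfo tree and the criterion `[[1, 0, 1]]` (the Sched of the first Subject), only PersonalInfo, Teaching, Subject and Sched remain.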

14

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the example DTD and its tree view (PersonalInfo with children Contact: Status, Name, Surname; Teaching: Subject: Name, Sched, Course; Research: Project: Name, Year, Budget), shown for the original web page and for its slice.]

15

Slicing XML Documents

• XML backward slicing criterion

[Figure: the example XML document and its tree view (root PersonalInfo with children Contact, Teaching and Research), shown for the original web page and for its slice.]

17

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria.
  • XML slicing criteria are more fine-grained than DTD slicing criteria.

• We distinguish between forward and backward slices (or a combination of both).

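The difference between the kinds of slices can be sketched in Haskell. The slides define the backward slice textually (path from the root) but show forward slices only graphically, so this sketch assumes that a forward slice keeps the criterion node together with all its descendants. The names `Tree`, `Pos`, `forwardSlice`, `backwardForwardSlice` and `child` are hypothetical.

```haskell
-- Hypothetical labelled tree standing in for an XML document.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position is the list of child indices (0-based) from the root to a node.
type Pos = [Int]

-- Forward slice (assumed meaning): the criterion node with all its descendants.
forwardSlice :: Pos -> Tree -> Maybe Tree
forwardSlice [] t = Just t
forwardSlice (i : is) (Node _ kids) = child i kids >>= forwardSlice is

-- Backward-forward slice: the path from the root to the criterion node,
-- followed by the whole subtree below that node.
backwardForwardSlice :: Pos -> Tree -> Maybe Tree
backwardForwardSlice [] t = Just t
backwardForwardSlice (i : is) (Node label kids) = do
  kid <- child i kids
  kept <- backwardForwardSlice is kid
  Just (Node label [kept])

-- Safe indexing: Nothing when the position does not exist in the tree.
child :: Int -> [a] -> Maybe a
child i xs
  | i >= 0, (x : _) <- drop i xs = Just x
  | otherwise = Nothing
```

Both functions return `Nothing` for a position that is not in the tree, which keeps an invalid criterion from silently producing an empty slice.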


19

Slicing XML Documents

• XML forward slicing criterion

[Figure: the example XML document and its tree view (root PersonalInfo with children Contact, Teaching and Research), shown for the original web page and for its slice.]

Page 4: Josep F. Silva Galiana

4

Program Slicing

bull DefinitionDefinition Program transformation to extract the program statements that (potentially) affect the values computed at some point of interest

bull Origin Origin Originally introduced by Weiser

bull ExampleExample (1) read(n) (2) i=1(3) sum=0(4) product=1(5) while (ilt=n) do

begin(6) sum=sum+i(7) product=producti(8) i=i+1

end(9) write(sum)(10) write(product)

Slicing Criterion = (10 product)

5

Program Slicing

bull ApplicationsApplications Debugging Code understanding Specialization etc

All the applications are based on the Program Dependence Graphs (PDGs) (structure and behaviour of programs)

What would happen if Program Slicing was applied to a data structure Would it be interesting

6

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

XML

7

XML

bull OriginOrigin XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996

bull StructureStructure Documents are trees composed by lsquoELEMENTSrsquo which contain attributes

Example of XML document

XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

8

XML

bull ObjectiveObjective The purpose of a DTD is to define the legal building blocks of an XML document It defines the document structure with a list of legal elements

bull StructureStructure Documents are graphs composed by lsquoELEMENTSrsquo

Example of DTD document

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)

9

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status Name Surname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

10

XML

bull ObjectiveObjective XSLT is a language for transforming XML

bull StructureStructure An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO

bull XSLT is a programming language

Example of XSLT document(Source Code)

XSLT XSLT (e(eXXtensible tensible SStylesheet tylesheet LLanguage anguage TTransformations)ransformations)

Example of XSLT document(Result)

11

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Slicing XML Documents

12

Slicing XML Documentsbull We see XML documents and DTDs as trees

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

bull We distinguish between DTD and XML slicing criterionsbull XML slicing criterions are more fine-grained than DTD slicing criterions

bull We distinguish between forward and backward slices (or a combination)

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

18

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

20

Slicing XML Documents

• XML backward-forward slicing criterion

[Figure: the PersonalInfo XML document and its tree (root PersonalInfo with children Contact, Teaching and Research), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

21

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa

• We guarantee that XML slices are valid with respect to DTD slices

[Diagram: the Slicer receives a DTD document, an XML document and a Slicing Criterion, and produces a DTD Slice document and an XML Slice document]
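One way to picture how a DTD slice can accompany an XML slice is the Haskell sketch below. It assumes a deliberately simplified DTD model: a list of element declarations, each paired with the element names of its content model, with occurrence indicators such as + and attribute lists ignored. The names Dtd, exampleDtd and sliceDtd are hypothetical and are not taken from the prototype.

```haskell
type Name = String

-- Simplified DTD (assumption of this sketch): one entry per <!ELEMENT>
-- declaration, listing only the element names of its content model.
type Dtd = [(Name, [Name])]

-- The example DTD of the slides, reduced to this model
-- (declarations with content ANY are left out).
exampleDtd :: Dtd
exampleDtd =
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact",      ["Status", "Name", "Surname"])
  , ("Teaching",     ["Subject"])
  , ("Subject",      ["Name", "Sched", "Course"])
  , ("Research",     ["Project"])
  ]

-- Keep the declarations of the elements that survive in the XML slice and
-- drop every reference to an element that was sliced away.
sliceDtd :: [Name] -> Dtd -> Dtd
sliceDtd kept dtd =
  [ (n, filter (`elem` kept) children)
  | (n, children) <- dtd
  , n `elem` kept
  ]
```

With kept = ["PersonalInfo", "Teaching", "Subject"], only three declarations remain and PersonalInfo refers to Teaching alone. Selecting by name cannot tell Contact/Name from Subject/Name, which is one reason to work with positions in the tree instead.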

22

Slicing XML Documents

• A simple slicing algorithm

23

Slicing XML Documents

• In the case of a DTD criterion composed of a set of positions C = {p1, …, pn} ⊆ Pos(D), the algorithm would be the same, except that the first loop would be:

    For each v1.v2.(…).vn ∈ C do
        V' = V' ∪ {v1, v1.v2, …, v1.v2.(…).vn}
        W' = W' ∪ {v1|i.v2|j.(…).vn|k}
    where v1.v2.(…).vn ∈ v' and v1|i.v2|j.(…).vn|k ∈ X

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion
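The loop for a DTD criterion can be illustrated with a small Haskell sketch. It rests on assumptions that are not in the slides: a DTD position is encoded as the list of element names from the root, an XML position as the same list with a sibling index attached to each name, and W' is read as "every XML position whose names match a position in V'". The names TypePath, XmlPos, typePath, prefixes and dtdCriterionSlice are invented for the example.

```haskell
import Data.List (inits, nub)

-- Assumed encodings (not from the slides): a DTD position lists the element
-- names from the root; an XML position also records each node's index.
type TypePath = [String]
type XmlPos   = [(String, Int)]

-- Forget the indices of an XML position to obtain its DTD position.
typePath :: XmlPos -> TypePath
typePath = map fst

-- Non-empty prefixes of a path: v1, v1.v2, ..., v1.v2.(...).vn
prefixes :: [a] -> [[a]]
prefixes = drop 1 . inits

-- For a DTD criterion, collect the DTD positions V' and the XML positions W'.
dtdCriterionSlice :: [XmlPos] -> [TypePath] -> ([TypePath], [XmlPos])
dtdCriterionSlice xmlPositions criterion = (v', w')
  where
    v' = nub (concatMap prefixes criterion)
    w' = [p | p <- xmlPositions, typePath p `elem` v']
```

With the criterion PersonalInfo.Teaching.Subject, every Subject node of the document ends up in W' together with its ancestors, which matches the remark that DTD criteria are coarser than XML criteria.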

24

Slicing XML Documents

The following theorem states the correctness of the technique:

Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D' of D and a slice X' of X computed with an XML slicing criterion C, and given a slice D'' of D and a slice X'' of X computed with a DTD slicing criterion C', then:

a) D' is well-formed and X' is valid with respect to D'
b) D'' is well-formed and X'' is valid with respect to D''

If all the elements in C are of one of the types in C', then:

c) D' = D''
d) X' is a subtree of X''

25

Contents

Motivation
Program Slicing
XML
• DTD
• XSLT
Slicing XML Documents
• Example
Implementation
Conclusions & Future Work

Implementation

26

Implementation

We have implemented a prototype in Haskell

Haskell provides us with a formal basis that has many advantages for the manipulation of XML documents

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

    data Element   = Elem Name [Attribute] [Content]
    type Attribute = (Name, Value)
    data Content   = CElem Element
                   | CText String
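To show how a slicer might traverse these data structures, here is a minimal, self-contained sketch. It redeclares the types locally instead of importing HaXml, and it assumes that Name and Value are plain strings; the library's real definitions may differ between versions. The position encoding (0-based indices into the content list) and the function backwardSlice are assumptions of the sketch, not the prototype's code.

```haskell
type Name  = String   -- assumed to be String for this sketch
type Value = String   -- assumed to be String for this sketch

data Element   = Elem Name [Attribute] [Content] deriving (Eq, Show)
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
               deriving (Eq, Show)

-- A position is the list of 0-based indices into the content lists that
-- leads from the root to a node; [] denotes the root itself.
type Pos = [Int]

-- Keep only the elements lying on a path from the root to a criterion node.
-- Attributes and text of the kept elements are preserved (a design choice
-- of this sketch).
backwardSlice :: [Pos] -> Element -> Element
backwardSlice criterion (Elem n as cs) =
  Elem n as (concat (zipWith keep [0 ..] cs))
  where
    keep i (CElem e)
      | null below = []                                -- not on any path
      | otherwise  = [CElem (backwardSlice below e)]   -- descend further
      where
        below = [ps | (p : ps) <- criterion, p == i]
    keep _ t = [t]                                     -- text is kept
```

For a document whose root has children Contact and Teaching, the criterion [[1, 0]] keeps the root, Teaching and the first child of Teaching, and drops everything else.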

27

Implementation

From XML slices to Webpage slices

[Diagram, drawn twice: XML (Data) + XSLT (Presentation) → WebPage]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated)

Both the XML data and the presentation labels are generated together

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31

Contents

Motivation
Program Slicing
XML
• DTD
• XSLT
Slicing XML Documents
• Example
Implementation
Conclusions & Future Work

Conclusions & Future Work

32

Conclusions

We proposed the application of program slicing techniques to XML data structures

We defined an algorithm to slice XML and DTD documents

• XML and DTD slices that are well-formed and valid
• Previous slicers can be used with a modest implementation effort

Slicing Web Pages
• The slicer can use XSLT in order to slice webpages
• We proposed some guidelines to generate XSLT files

Future Work
• Migration to XML Schema
• New implementation based on XQuery


5

Program Slicing

• Applications: Debugging, Code understanding, Specialization, etc.

All the applications are based on Program Dependence Graphs (PDGs) (structure and behaviour of programs)

What would happen if Program Slicing was applied to a data structure? Would it be interesting?

6

Contents

Motivation
Program Slicing
XML
• DTD
• XSLT
Slicing XML Documents
• Example
Implementation
Conclusions & Future Work

XML

7

XML

XML (eXtensible Markup Language)

• Origin: XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996

• Structure: Documents are trees composed of 'ELEMENTS', which contain attributes

Example of XML document

8

XML

DTD (Document Type Definition)

• Objective: The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements

• Structure: Documents are graphs composed of 'ELEMENTS'

Example of DTD document

9

XML (eXtensible Markup Language)

    <PersonalInfo>
      <Contact>
        <Status> Professor </Status>
        <Name> Ryan </Name>
        <Surname> Gibson </Surname>
      </Contact>
      <Teaching>
        <Subject>
          <Name> Logic </Name>
          <Sched> MonWed 16-18 </Sched>
          <Course> 4-Mathematics </Course>
        </Subject>
        <Subject>
          <Name> Algebra </Name>
          <Sched> MonTur 11-13 </Sched>
          <Course> 3-Mathematics </Course>
        </Subject>
        …
      </Teaching>
      <Research>
        <Project name="SysLog" year="2003-2004" budget="16000euro" />
      </Research>
    </PersonalInfo>

DTD (Document Type Definition)

    <!ELEMENT PersonalInfo (Contact, Teaching, Research)>
    <!ELEMENT Contact (Status, Name, Surname)>
    <!ELEMENT Status ANY>
    <!ELEMENT Name ANY>
    <!ELEMENT Surname ANY>
    <!ELEMENT Teaching (Subject+)>
    <!ELEMENT Subject (Name, Sched, Course)>
    <!ELEMENT Sched ANY>
    <!ELEMENT Course ANY>
    <!ELEMENT Research (Project)>
    <!ELEMENT Project ANY>
    <!ATTLIST Project
        name   CDATA #REQUIRED
        year   CDATA #REQUIRED
        budget CDATA #IMPLIED
    >

10

XML

XSLT (eXtensible Stylesheet Language Transformations)

• Objective: XSLT is a language for transforming XML

• Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO

• XSLT is a programming language

Example of XSLT document (Source Code)

Example of XSLT document (Result)

11

Contents

Motivation
Program Slicing
XML
• DTD
• XSLT
Slicing XML Documents
• Example
Implementation
Conclusions & Future Work

Slicing XML Documents

12

Slicing XML Documents

• We see XML documents and DTDs as trees

[Figure: the PersonalInfo XML document next to its tree. The root PersonalInfo has the children Contact (Professor, Ryan, Gibson), Teaching (Subject: Logic, MonWed 16-18, 4-Mathematics; Subject: Algebra, MonTur 11-13, 3-Mathematics; …) and Research (Project: Syslog, 2003-2004, 16000 euro; …).]

13

Slicing XML Documents

• The slicing criterion is composed of a set of nodes in the tree

• For each node in the slicing criterion, we extract from the tree all those nodes that are on the path from the root to the node

[Figure: Web Page (Original) and Web Page (Slice), with the slicing options XML or DTD criterion and Forward or Backward slice]
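The extraction of root-to-node paths can be sketched in a few lines of Haskell. The sketch assumes that a node is identified by the list of labels on the path from the root to it. That encoding cannot tell two sibling Subject nodes apart, so it is only meant to show the idea. Path and backwardSlice are hypothetical names.

```haskell
import Data.List (inits, nub)

-- A node is identified by the labels on the path from the root to it
-- (an encoding assumed for this sketch).
type Path = [String]

-- Backward slice: every node on a path from the root to a criterion node,
-- i.e. all non-empty prefixes of the criterion paths, without repetitions.
backwardSlice :: [Path] -> [Path]
backwardSlice = nub . concatMap (drop 1 . inits)
```

For the criterion {PersonalInfo.Teaching.Subject, PersonalInfo.Research}, the slice consists of PersonalInfo, PersonalInfo.Teaching, PersonalInfo.Teaching.Subject and PersonalInfo.Research.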

14

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the example DTD and its tree (PersonalInfo with children Contact, Teaching and Research; Contact with Status, Name and Surname; Subject with Name, Sched and Course; Project with Name, Year and Budget), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

15

Slicing XML Documents

• XML backward slicing criterion

[Figure: the PersonalInfo XML document and its tree (root PersonalInfo with children Contact, Teaching and Research), shown twice, before slicing. Panels: Web Page (Original), Web Page (Slice).]

16

Slicing XML Documents

• XML backward slicing criterion

[Figure: the tree of the PersonalInfo document before and after slicing, together with the XML source. Panels: Web Page (Original), Web Page (Slice).]

17

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria
• XML slicing criteria are more fine-grained than DTD slicing criteria

• We distinguish between forward and backward slices (or a combination)

[Figure: Web Page (Original) and Web Page (Slice), with the slicing options XML or DTD criterion and Forward or Backward slice]
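The difference between the slice directions can be made concrete with a short Haskell sketch. It reads a forward slice as the criterion nodes plus everything below them, and a backward slice as the nodes on the paths from the root. The forward reading is an interpretation, since the slides only show it in figures. Nodes are again encoded as label paths, and all function names are hypothetical.

```haskell
import Data.List (inits, isPrefixOf, nub)

-- A node is identified by the labels on the path from the root to it
-- (an encoding assumed for this sketch).
type Path = [String]

-- Backward slice: the criterion nodes and all their ancestors.
backwardSlice :: [Path] -> [Path]
backwardSlice = nub . concatMap (drop 1 . inits)

-- Forward slice (as read here): the criterion nodes and all their descendants.
forwardSlice :: [Path] -> [Path] -> [Path]
forwardSlice allNodes criterion =
  [p | p <- allNodes, any (`isPrefixOf` p) criterion]

-- Combination of both directions.
backwardForwardSlice :: [Path] -> [Path] -> [Path]
backwardForwardSlice allNodes criterion =
  nub (backwardSlice criterion ++ forwardSlice allNodes criterion)
```

With the criterion PersonalInfo.Teaching, the backward slice stops at Teaching, the forward slice adds the Subject nodes below it, and the combination keeps the whole branch from the root down.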

18

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the example DTD and its tree (PersonalInfo with children Contact, Teaching and Research; Contact with Status, Name and Surname; Subject with Name, Sched and Course; Project with Name, Year and Budget), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

Page 6: Josep F. Silva Galiana

6

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

XML

7

XML

bull OriginOrigin XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996

bull StructureStructure Documents are trees composed by lsquoELEMENTSrsquo which contain attributes

Example of XML document

XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

8

XML

bull ObjectiveObjective The purpose of a DTD is to define the legal building blocks of an XML document It defines the document structure with a list of legal elements

bull StructureStructure Documents are graphs composed by lsquoELEMENTSrsquo

Example of DTD document

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)

9

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status Name Surname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

10

XML

bull ObjectiveObjective XSLT is a language for transforming XML

bull StructureStructure An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO

bull XSLT is a programming language

Example of XSLT document(Source Code)

XSLT XSLT (e(eXXtensible tensible SStylesheet tylesheet LLanguage anguage TTransformations)ransformations)

Example of XSLT document(Result)

11

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Slicing XML Documents

12

Slicing XML Documentsbull We see XML documents and DTDs as trees

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

bull We distinguish between DTD and XML slicing criterionsbull XML slicing criterions are more fine-grained than DTD slicing criterions

bull We distinguish between forward and backward slices (or a combination)

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

18

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

20

Slicing XML Documents

• XML backward-forward slicing criterion

[Figure: the PersonalInfo XML document and its tree (root PersonalInfo with children Contact, Teaching and Research), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

21

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa.

• We guarantee that XML slices are valid with respect to DTD slices.

[Diagram: the Slicer takes a DTD document, an XML document and a Slicing Criterion, and produces a DTD Slice document and an XML Slice document.]
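
The pairing of XML slices with DTD slices can be illustrated with a minimal Haskell sketch. It is not the prototype's code: the types Tree and DTD, and the functions sliceDTD and validWrt, are hypothetical names, and the DTD is simplified to "which child types each element type allows" (order, repetition and attributes are ignored).

```haskell
import Data.List (nub)

-- Simplified XML tree: an element name and its child elements (text omitted).
data Tree = Node String [Tree] deriving (Eq, Show)

-- Simplified DTD: each element type with the child types it allows.
type DTD = [(String, [String])]

-- Every (parent type, child type) edge that occurs in an XML tree.
edges :: Tree -> [(String, String)]
edges (Node n cs) = [(n, c) | Node c _ <- cs] ++ concatMap edges cs

-- Element types that occur in an XML tree.
usedTypes :: Tree -> [String]
usedTypes (Node n cs) = nub (n : concatMap usedTypes cs)

-- DTD slice for an XML slice: keep only declarations of used types,
-- and inside them only the child types actually used.
sliceDTD :: DTD -> Tree -> DTD
sliceDTD dtd x =
  [ (t, [c | c <- cs, (t, c) `elem` used])
  | (t, cs) <- dtd, t `elem` ts ]
  where
    used = edges x
    ts   = usedTypes x

-- Simplified validity: every element is declared and only has allowed children.
validWrt :: DTD -> Tree -> Bool
validWrt dtd (Node n cs) =
  case lookup n dtd of
    Nothing      -> False
    Just allowed ->
      all (\c@(Node m _) -> m `elem` allowed && validWrt dtd c) cs
```

Under these assumptions, an XML slice is valid with respect to the DTD slice derived from it by construction, which is the property the slide states for the real technique.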

22

Slicing XML Documents

• A simple slicing algorithm

23

Slicing XML Documents

• In the case of a DTD criterion composed of a set of positions C = {p1, …, pn} ⊆ Pos(D), the algorithm would be the same, except that the first loop would be:

For each v1.v2.(…).vn ∈ C do
    V’ = V’ ∪ {v1, v1.v2, …, v1.v2.(…).vn}
    W’ = W’ ∪ {v1|i.v2|j.(…).vn|k}
    where v1.v2.(…).vn ∈ V’ and v1|i.v2|j.(…).vn|k ∈ X

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
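
The loop for a DTD criterion can be sketched in Haskell as follows. This is an illustration under assumptions, not the slide's algorithm verbatim: a DTD position is modelled as a path of element types from the root, and Tree, TypePath, dtdSlice and xmlSlice are hypothetical names.

```haskell
import Data.List (inits, nub)

-- Simplified XML tree: an element name and its child elements.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A DTD position, written as the path of element types from the root.
type TypePath = [String]

-- All non-empty prefixes of a path: v1, v1.v2, ..., v1.v2.(...).vn.
prefixes :: TypePath -> [TypePath]
prefixes = drop 1 . inits

-- V': the DTD positions kept by the slice (the criterion plus all ancestors).
dtdSlice :: [TypePath] -> [TypePath]
dtdSlice = nub . concatMap prefixes

-- W': prune the XML tree, keeping only elements whose type path is in V'.
xmlSlice :: [TypePath] -> Tree -> Maybe Tree
xmlSlice v' = go []
  where
    go up (Node n cs)
      | path `elem` v' = Just (Node n [c' | Just c' <- map (go path) cs])
      | otherwise      = Nothing
      where
        path = up ++ [n]
```

A call such as xmlSlice (dtdSlice c) x keeps every XML element whose type path is selected, so one DTD position may select several XML nodes; this matches the remark that XML criteria are finer-grained than DTD criteria.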

24

Slicing XML Documents

The following theorem states the correctness of the technique.

Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D’ of D and a slice X’ of X computed with an XML slicing criterion C, and given a slice D’’ of D and a slice X’’ of X computed with a DTD slicing criterion C’, then:

a) D’ is well-formed and X’ is valid with respect to D’
b) D’’ is well-formed and X’’ is valid with respect to D’’

If all the elements in C are of one of the types in C’, then:

c) D’ = D’’
d) X’ is a subtree of X’’

25

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Implementation

26

Implementation

We have implemented a prototype in Haskell.

Haskell provides a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

data Element   = Elem Name [Attribute] [Content]
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
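
As an illustration, the Contact fragment of the running example can be written with these structures. The sketch below declares its own simplified copies of the types (Name and Value are assumed to be String, and Attribute is a type synonym); it does not use the HaXml API, and contact and render are hypothetical names.

```haskell
type Name  = String
type Value = String

-- Local, simplified copies of the structures shown on the slide.
data Element   = Elem Name [Attribute] [Content] deriving (Eq, Show)
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
               deriving (Eq, Show)

-- The <Contact> fragment of the running example.
contact :: Element
contact = Elem "Contact" []
  [ CElem (Elem "Status"  [] [CText "Professor"])
  , CElem (Elem "Name"    [] [CText "Ryan"])
  , CElem (Elem "Surname" [] [CText "Gibson"])
  ]

-- Serialise an element back to XML text (no escaping; illustration only).
render :: Element -> String
render (Elem n as cs) =
  "<" ++ n ++ concatMap attr as ++ ">" ++ concatMap content cs ++ "</" ++ n ++ ">"
  where
    attr (k, v)       = " " ++ k ++ "=\"" ++ v ++ "\""
    content (CElem e) = render e
    content (CText t) = t
```

Because a document is an ordinary algebraic data type, a slicer can be written as a plain recursive function over Element and Content.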

27

Implementation

From XML slices to Webpage slices

[Diagram: XML (Data) and XSLT (Presentation) produce the WebPage.]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Conclusions & Future Work

32

Conclusions

We proposed the application of program slicing techniques to XML data structures.

We defined an algorithm to slice XML and DTD documents:
  • XML and DTD slices are well-formed and valid
  • Previous slicers can be used with a modest implementation effort

Slicing Web Pages:
  • The slicer can use XSLT in order to slice webpages
  • We proposed some guidelines to generate XSLT files

Future Work:
  • Migration to XML Schema
  • New implementation based on XQuery


7

XML

XML (eXtensible Markup Language)

• Origin: XML was developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996.

• Structure: Documents are trees composed of ‘ELEMENTS’, which contain attributes.

Example of XML document

8

XML

DTD (Document Type Definition)

• Objective: The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.

• Structure: Documents are graphs composed of ‘ELEMENTS’.

Example of DTD document

9

XML (eXtensible Markup Language)

<PersonalInfo>
  <Contact>
    <Status> Professor </Status>
    <Name> Ryan </Name>
    <Surname> Gibson </Surname>
  </Contact>
  <Teaching>
    <Subject>
      <Name> Logic </Name>
      <Sched> Mon/Wed 16-18 </Sched>
      <Course> 4-Mathematics </Course>
    </Subject>
    <Subject>
      <Name> Algebra </Name>
      <Sched> Mon/Tur 11-13 </Sched>
      <Course> 3-Mathematics </Course>
    </Subject>
    …
  </Teaching>
  <Research>
    <Project name = "SysLog" year = "2003-2004" budget = "16000€" />
  </Research>
</PersonalInfo>

DTD (Document Type Definition)

<!ELEMENT PersonalInfo (Contact, Teaching, Research)>
<!ELEMENT Contact (Status, Name, Surname)>
<!ELEMENT Status ANY>
<!ELEMENT Name ANY>
<!ELEMENT Surname ANY>
<!ELEMENT Teaching (Subject+)>
<!ELEMENT Subject (Name, Sched, Course)>
<!ELEMENT Sched ANY>
<!ELEMENT Course ANY>
<!ELEMENT Research (Project)>
<!ELEMENT Project ANY>
<!ATTLIST Project
    name CDATA #REQUIRED
    year CDATA #REQUIRED
    budget CDATA #IMPLIED
>
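
The remark that DTDs are graphs while XML documents are trees can be made concrete with a small Haskell sketch. It is an illustration only: the adjacency-list encoding and the names DTD, example, parents and paths are assumptions, and attributes and occurrence indicators are left out.

```haskell
-- DTD of the running example as an adjacency list (element type -> child types).
type DTD = [(String, [String])]

example :: DTD
example =
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact",      ["Status", "Name", "Surname"])
  , ("Teaching",     ["Subject"])
  , ("Subject",      ["Name", "Sched", "Course"])
  , ("Research",     ["Project"])
  ]

-- Parents of an element type; more than one parent means the DTD is a graph,
-- not a tree.
parents :: DTD -> String -> [String]
parents dtd t = [p | (p, cs) <- dtd, t `elem` cs]

-- Unfold the graph from a root into the list of root-to-node type paths,
-- i.e. the tree view of the DTD.
-- (Terminates only for non-recursive DTDs such as this one.)
paths :: DTD -> String -> [[String]]
paths dtd root = go [root]
  where
    go path    = path : concatMap (\c -> go (path ++ [c])) (children (last path))
    children t = maybe [] id (lookup t dtd)
```

Here Name has two parents (Contact and Subject), so the DTD is a graph; unfolding it yields one tree node per path, which is the representation the slicing slides work on.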

10

XML

XSLT (eXtensible Stylesheet Language Transformations)

• Objective: XSLT is a language for transforming XML.

• Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO.

• XSLT is a programming language.

Example of XSLT document (Source Code)

Example of XSLT document (Result)

11

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Slicing XML Documents

12

Slicing XML Documents

• We see XML documents and DTDs as trees

[Figure: the PersonalInfo XML document and its tree. Root PersonalInfo has children Contact (Professor, Ryan, Gibson), Teaching (Subject: Logic, Mon/Wed 16-18, 4-Mathematics; Subject: Algebra, Mon/Tur 11-13, 3-Mathematics; …) and Research (Project: SysLog, 2003-2004, 16000 euro).]

13

Slicing XML Documents

• The slicing criterion is composed of a set of nodes in the tree.

• For each node in the slicing criterion, we extract from the tree all the nodes on the path from the root to that node.

[Diagram labels: Web Page (Original), Web Page (Slice); XML, DTD; Forward, Backward]
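
The path extraction described above can be sketched in Haskell. This is a minimal illustration, not the prototype: nodes are identified by positions (lists of child indexes, an assumed encoding), and Tree, Pos and backward are hypothetical names.

```haskell
import Data.List (isPrefixOf)

-- Simplified XML tree: an element name and its child elements.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position is the list of child indexes leading from the root to a node
-- ([] is the root itself).
type Pos = [Int]

-- Keep a node only if it lies on the path from the root to some criterion node.
backward :: [Pos] -> Tree -> Maybe Tree
backward crit = go []
  where
    go pos (Node n cs)
      | any (pos `isPrefixOf`) crit =
          Just (Node n [ c' | (i, c) <- zip [0 ..] cs
                            , Just c' <- [go (pos ++ [i]) c] ])
      | otherwise = Nothing
```

With the criterion [[1,0,0]] on the running example (Teaching, first Subject, Name), only PersonalInfo, Teaching, that Subject and its Name survive; an empty criterion yields Nothing.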

14

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the example DTD and its tree, shown before and after slicing. Root PersonalInfo has children Contact (Status, Name, Surname), Teaching (Subject: Name, Sched, Course) and Research (Project with attributes Name, Year, Budget). Panels: Web Page (Original), Web Page (Slice).]

15

Slicing XML Documents

• XML backward slicing criterion

[Figure: the PersonalInfo XML document and its tree (root PersonalInfo with children Contact, Teaching and Research), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

17

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria
  • XML slicing criteria are more fine-grained than DTD slicing criteria

• We distinguish between forward and backward slices (or a combination)

[Diagram labels: Web Page (Original), Web Page (Slice); XML, DTD; Forward, Backward]
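
The forward and combined variants can be sketched in the same style. The slides do not define forward slicing in text, so this sketch assumes that a forward slice keeps the criterion node together with its descendants; Tree, Pos, subtreeAt, forward and backwardForward are hypothetical names.

```haskell
-- Simplified XML tree: an element name and its child elements.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position is the list of child indexes leading from the root to a node.
type Pos = [Int]

-- Subtree found at a position, if the position exists.
subtreeAt :: Pos -> Tree -> Maybe Tree
subtreeAt [] t = Just t
subtreeAt (i : is) (Node _ cs)
  | i >= 0 && i < length cs = subtreeAt is (cs !! i)
  | otherwise               = Nothing

-- Forward slice (assumed meaning): the criterion node and everything below it.
forward :: Pos -> Tree -> Maybe Tree
forward = subtreeAt

-- Backward-forward slice: the path from the root down to the criterion node,
-- followed by the criterion node's whole subtree.
backwardForward :: Pos -> Tree -> Maybe Tree
backwardForward [] t = Just t
backwardForward (i : is) (Node n cs)
  | i >= 0 && i < length cs = do
      c' <- backwardForward is (cs !! i)
      Just (Node n [c'])
  | otherwise = Nothing
```

For the position [1,0] in the running example, forward returns the first Subject with its Name, Sched and Course, while backwardForward additionally wraps it in Teaching and PersonalInfo so that the result is still rooted at the document root.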

18

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the example DTD and its tree (root PersonalInfo with children Contact, Teaching and Research), shown before and after slicing. Panels: Web Page (Original), Web Page (Slice).]

19

Page 8: Josep F. Silva Galiana

8

XML

bull ObjectiveObjective The purpose of a DTD is to define the legal building blocks of an XML document It defines the document structure with a list of legal elements

bull StructureStructure Documents are graphs composed by lsquoELEMENTSrsquo

Example of DTD document

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)

9

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status Name Surname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

DTD DTD ((DDocument ocument TType ype DDefinition)efinition)XML XML (e(eXXtensible tensible MMarkup arkup LLanguage)anguage)

10

XML

bull ObjectiveObjective XSLT is a language for transforming XML

bull StructureStructure An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO

bull XSLT is a programming language

Example of XSLT document(Source Code)

XSLT XSLT (e(eXXtensible tensible SStylesheet tylesheet LLanguage anguage TTransformations)ransformations)

Example of XSLT document(Result)

11

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Slicing XML Documents

12

Slicing XML Documentsbull We see XML documents and DTDs as trees

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

bull We distinguish between DTD and XML slicing criterionsbull XML slicing criterions are more fine-grained than DTD slicing criterions

bull We distinguish between forward and backward slices (or a combination)

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

18

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

Slicing XML Documents

• XML backward-forward slicing criterion

[Figure: the PersonalInfo document as XML text and as a tree (PersonalInfo with children Contact, Teaching and Research), shown before and after slicing, with the corresponding pages labelled "Web Page (Original)" and "Web Page (Slice)".]

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa.

• We guarantee that XML slices are valid with respect to DTD slices.

[Diagram: the Slicer takes a DTD document, an XML document and a slicing criterion, and produces a DTD slice and an XML slice.]
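The pairing of XML slices with DTD slices can be illustrated with a minimal Haskell sketch. It is not the authors' slicer: the `Tree` and `DTD` types, and the names `typesOf`, `dtdSliceFor` and `validAgainst`, are hypothetical, the DTD is reduced to "which element types may appear as children", and validity is checked only in that weak sense (occurrence operators, ordering and attributes are ignored).

```haskell
import Data.List (nub)

-- A labelled tree standing in for an XML document (assumption: text and attributes omitted).
data Tree = Node String [Tree] deriving (Eq, Show)

-- Simplified DTD: each element type is mapped to the element types allowed as its children.
type DTD = [(String, [String])]

-- All element types occurring in a tree, without repetitions.
typesOf :: Tree -> [String]
typesOf (Node l cs) = nub (l : concatMap typesOf cs)

-- DTD slice induced by an XML slice: keep only the declarations of element
-- types that occur in the XML slice, and drop unused types from content models.
dtdSliceFor :: Tree -> DTD -> DTD
dtdSliceFor xmlSlice dtd =
  [ (n, filter (`elem` used) kids) | (n, kids) <- dtd, n `elem` used ]
  where
    used = typesOf xmlSlice

-- Weak validity: every element is declared and each child is allowed by its parent.
validAgainst :: DTD -> Tree -> Bool
validAgainst dtd (Node l cs) =
  case lookup l dtd of
    Nothing   -> False
    Just kids -> all (\c@(Node m _) -> m `elem` kids && validAgainst dtd c) cs

-- The DTD of the running example, in the simplified representation.
personalInfoDTD :: DTD
personalInfoDTD =
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact", ["Status", "Name", "Surname"])
  , ("Status", []), ("Name", []), ("Surname", [])
  , ("Teaching", ["Subject"])
  , ("Subject", ["Name", "Sched", "Course"])
  , ("Sched", []), ("Course", [])
  , ("Research", ["Project"])
  , ("Project", [])
  ]

-- An XML slice that keeps only the schedule of one subject.
schedSlice :: Tree
schedSlice = Node "PersonalInfo" [Node "Teaching" [Node "Subject" [Node "Sched" []]]]
```

In GHCi, `dtdSliceFor schedSlice personalInfoDTD` keeps the four declarations for PersonalInfo, Teaching, Subject and Sched, and `validAgainst` accepts `schedSlice` against that reduced DTD.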

Slicing XML Documents

• A simple slicing algorithm

• In the case of a DTD criterion composed of a set of positions $C = \{p_1, \ldots, p_n\} \subseteq Pos(D)$, the algorithm would be the same, except that the first loop would be:

For each $v_1.v_2.(\ldots).v_n \in C$ do
    $V' = V' \cup \{v_1,\ v_1.v_2,\ \ldots,\ v_1.v_2.(\ldots).v_n\}$
    $W' = W' \cup \{v_1|_i.v_2|_j.(\ldots).v_n|_k\}$
where $v_1.v_2.(\ldots).v_n \in V'$ and $v_1|_i.v_2|_j.(\ldots).v_n|_k \in X$

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
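As a rough illustration of the DTD-criterion case, the sketch below slices an XML tree using paths of element type names. It is one reading of the loop above, not the authors' algorithm: the `Tree` type and the names `sliceByTypes` and `sampleDoc` are hypothetical. Every XML node whose path of labels from the root is a prefix of some criterion path is kept, so one DTD position may select several XML nodes.

```haskell
-- A labelled tree standing in for an XML document (assumption: text and attributes omitted).
data Tree = Node String [Tree] deriving (Eq, Show)

-- Slice an XML tree with a DTD criterion given as paths of element type names.
-- Returns Nothing when no criterion path starts with the label of this node.
sliceByTypes :: [[String]] -> Tree -> Maybe Tree
sliceByTypes crit (Node l cs)
  | null here = Nothing
  | otherwise = Just (Node l [ c' | c <- cs, Just c' <- [sliceByTypes rest c] ])
  where
    -- Remainders of the criterion paths that start at this label.
    here = [ p | (v : p) <- crit, v == l ]
    -- Paths that continue below this node; exhausted paths select no children.
    rest = filter (not . null) here

-- A reduced version of the running example.
sampleDoc :: Tree
sampleDoc =
  Node "PersonalInfo"
    [ Node "Contact" [Node "Name" []]
    , Node "Teaching"
        [ Node "Subject" [Node "Name" [], Node "Sched" []]
        , Node "Subject" [Node "Name" [], Node "Sched" []]
        ]
    ]
```

With the criterion `[["PersonalInfo", "Teaching", "Subject", "Sched"]]`, both Subject nodes are kept, each with only its Sched child. This matches the observation that a DTD criterion is coarser than an XML criterion.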

Slicing XML Documents

The following theorem states the correctness of the technique.

Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D' of D and a slice X' of X computed with an XML slicing criterion C, and given a slice D'' of D and a slice X'' of X computed with a DTD slicing criterion C', then:

a) D' is well-formed and X' is valid with respect to D'
b) D'' is well-formed and X'' is valid with respect to D''

If all the elements in C are of one of the types in C', then:

c) D' = D''
d) X' is a subtree of X''


Implementation

We have implemented a prototype in Haskell.

Haskell provides a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

data Element   = Elem Name [Attribute] [Content]
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
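To show how such a representation can be manipulated, here is a small self-contained sketch. The types are local stand-ins that mirror the simplified declarations on the slide; they are not imported from HaXml, whose actual types have more constructors. The helper names `childNames`, `keepChildren` and `logicSubject` are hypothetical.

```haskell
type Name      = String
type Value     = String
type Attribute = (Name, Value)

-- Simplified element and content types, mirroring the slide.
data Element = Elem Name [Attribute] [Content] deriving (Eq, Show)
data Content = CElem Element
             | CText String
             deriving (Eq, Show)

-- Names of the child elements, in document order.
childNames :: Element -> [Name]
childNames (Elem _ _ cs) = [ n | CElem (Elem n _ _) <- cs ]

-- Keep only the child elements whose name is listed; text content is kept as is.
keepChildren :: [Name] -> Element -> Element
keepChildren keep (Elem n attrs cs) = Elem n attrs (filter wanted cs)
  where
    wanted (CElem (Elem m _ _)) = m `elem` keep
    wanted (CText _)            = True

-- One Subject element of the running example.
logicSubject :: Element
logicSubject =
  Elem "Subject" []
    [ CElem (Elem "Name"   [] [CText "Logic"])
    , CElem (Elem "Sched"  [] [CText "Mon/Wed 16-18"])
    , CElem (Elem "Course" [] [CText "4-Mathematics"])
    ]
```

For instance, `keepChildren ["Name"] logicSubject` yields a Subject element that contains only its Name child. A slicer over this representation would apply such a filter recursively along the paths selected by the criterion.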

Implementation

From XML slices to Webpage slices

[Diagram: XML (Data) and XSLT (Presentation) are combined to produce the WebPage.]

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.


Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml


Conclusions

• We proposed the application of program slicing techniques to XML data structures.

• We defined an algorithm to slice XML and DTD documents:
  - XML and DTD slices are well-formed and valid.
  - Previous slicers can be used with a modest implementation effort.

• Slicing Web Pages:
  - The slicer can use XSLT in order to slice webpages.
  - We proposed some guidelines to generate XSLT files.

Future Work

• Migration to XML Schema
• New implementation based on XQuery


XML (eXtensible Markup Language)

<PersonalInfo>
  <Contact>
    <Status> Professor </Status>
    <Name> Ryan </Name>
    <Surname> Gibson </Surname>
  </Contact>
  <Teaching>
    <Subject>
      <Name> Logic </Name>
      <Sched> Mon/Wed 16-18 </Sched>
      <Course> 4-Mathematics </Course>
    </Subject>
    <Subject>
      <Name> Algebra </Name>
      <Sched> Mon/Tur 11-13 </Sched>
      <Course> 3-Mathematics </Course>
    </Subject>
    …
  </Teaching>
  <Research>
    <Project
      name = "SysLog"
      year = "2003-2004"
      budget = "16000€" />
  </Research>
</PersonalInfo>

DTD (Document Type Definition)

<!ELEMENT PersonalInfo (Contact, Teaching, Research)>
<!ELEMENT Contact (Status, Name, Surname)>
<!ELEMENT Status ANY>
<!ELEMENT Name ANY>
<!ELEMENT Surname ANY>
<!ELEMENT Teaching (Subject+)>
<!ELEMENT Subject (Name, Sched, Course)>
<!ELEMENT Sched ANY>
<!ELEMENT Course ANY>
<!ELEMENT Research (Project)>
<!ELEMENT Project ANY>
<!ATTLIST Project
    name   CDATA #REQUIRED
    year   CDATA #REQUIRED
    budget CDATA #IMPLIED
>

XML

XSLT (eXtensible Stylesheet Language Transformations)

• Objective: XSLT is a language for transforming XML.

• Structure: An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary, such as (X)HTML or XSL-FO.

• XSLT is a programming language.

[Figures: example of an XSLT document (source code) and its result.]


Slicing XML Documents

• We see XML documents and DTDs as trees.

[Figure: the PersonalInfo document and its tree. The root PersonalInfo has children Contact (Professor, Ryan, Gibson), Teaching (Subject nodes: Logic, Mon/Wed 16-18, 4-Mathematics; Algebra, Mon/Tur 11-13, 3-Mathematics; …) and Research (Project: SysLog, 2003-2004, 16000€).]

Slicing XML Documents

• The slicing criterion is composed of a set of nodes in the tree.

• For each node in the slicing criterion, we extract from the tree all the nodes that lie on the path from the root to that node.

[Diagram labels: Web Page (Original), Web Page (Slice); XML, DTD; Forward, Backward.]
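The two bullets above can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not the authors' code: the `Tree` type, the encoding of a node as a list of 0-based child indices (`Pos`), and the names `backwardSlice` and `example` are all hypothetical.

```haskell
-- A labelled tree standing in for an XML document or a DTD.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position: the 0-based child indices leading from the root to a node.
type Pos = [Int]

-- Backward slice: keep only the nodes lying on a path from the root
-- to some position of the slicing criterion.
backwardSlice :: [Pos] -> Tree -> Tree
backwardSlice crit (Node l cs) =
  Node l [ backwardSlice sub c
         | (i, c) <- zip [0 ..] cs
         -- Criterion positions that pass through child i, made relative to it.
         , let sub = [ p | (j : p) <- crit, j == i ]
         , not (null sub)
         ]

-- The tree of the running example (text content omitted).
example :: Tree
example =
  Node "PersonalInfo"
    [ Node "Contact" [Node "Status" [], Node "Name" [], Node "Surname" []]
    , Node "Teaching"
        [ Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []]
        , Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []]
        ]
    , Node "Research" [Node "Project" []]
    ]
```

With the criterion `[[1, 0, 1]]` (the Sched node of the first Subject), `backwardSlice` returns the single path PersonalInfo, Teaching, Subject, Sched. Criteria with several positions yield the union of their paths.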

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the DTD and its tree (PersonalInfo with children Contact, Teaching and Research; Contact with Status, Name and Surname; Teaching with Subject, which has Name, Sched and Course; Research with Project, which has Name, Year and Budget), shown before and after slicing, with the corresponding pages labelled "Web Page (Original)" and "Web Page (Slice)".]

Slicing XML Documents

• XML backward slicing criterion

[Figure: the PersonalInfo document as XML text and as a tree (PersonalInfo with children Contact, Teaching and Research), shown before and after slicing, with the corresponding pages labelled "Web Page (Original)" and "Web Page (Slice)".]

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria.
• XML slicing criteria are more fine-grained than DTD slicing criteria.

• We distinguish between forward and backward slices (or a combination).

[Diagram labels: Web Page (Original), Web Page (Slice); XML, DTD; Forward, Backward.]
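The distinction between forward and backward slices can be sketched as follows. The slides do not spell out the forward case, so this assumes that a forward slice is the criterion node together with all its descendants, and that the combination keeps the path from the root plus that subtree. The `Tree` type and the names `forwardSlice`, `backwardForwardSlice` and `sample` are hypothetical.

```haskell
-- A labelled tree standing in for an XML document or a DTD.
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position: the 0-based child indices leading from the root to a node.
type Pos = [Int]

-- Forward slice (assumed reading): the criterion node with all its descendants.
-- Returns Nothing if the position does not exist in the tree.
forwardSlice :: Pos -> Tree -> Maybe Tree
forwardSlice []       t           = Just t
forwardSlice (i : is) (Node _ cs)
  | i >= 0 && i < length cs = forwardSlice is (cs !! i)
  | otherwise               = Nothing

-- Backward-forward slice: the path from the root down to the criterion node,
-- followed by the whole subtree below it.
backwardForwardSlice :: Pos -> Tree -> Maybe Tree
backwardForwardSlice []       t           = Just t
backwardForwardSlice (i : is) (Node l cs)
  | i >= 0 && i < length cs =
      fmap (\c -> Node l [c]) (backwardForwardSlice is (cs !! i))
  | otherwise = Nothing

-- A small tree to experiment with.
sample :: Tree
sample =
  Node "PersonalInfo"
    [ Node "Contact" [Node "Name" []]
    , Node "Teaching" [Node "Subject" [Node "Name" [], Node "Sched" []]]
    ]
```

Here `forwardSlice [1] sample` returns the Teaching subtree on its own, while `backwardForwardSlice [1] sample` returns the same subtree hanging from the PersonalInfo root.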

Slicing XML Documents

• DTD backward slicing criterion

[Figure: the DTD and its tree (PersonalInfo, Contact, Teaching, Research, Subject, Project and their children), shown before and after slicing, with the corresponding pages labelled "Web Page (Original)" and "Web Page (Slice)".]

Slicing XML Documents

• XML forward slicing criterion

[Figure: the PersonalInfo document as XML text and as a tree (PersonalInfo with children Contact, Teaching and Research), shown before and after slicing, with the corresponding pages labelled "Web Page (Original)" and "Web Page (Slice)".]

Page 10: Josep F. Silva Galiana

10

XML

bull ObjectiveObjective XSLT is a language for transforming XML

bull StructureStructure An XSLT stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses a formatting vocabulary such as (X)HTML or XSL-FO

bull XSLT is a programming language

Example of XSLT document(Source Code)

XSLT XSLT (e(eXXtensible tensible SStylesheet tylesheet LLanguage anguage TTransformations)ransformations)

Example of XSLT document(Result)

11

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Slicing XML Documents

12

Slicing XML Documentsbull We see XML documents and DTDs as trees

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

bull We distinguish between DTD and XML slicing criterionsbull XML slicing criterions are more fine-grained than DTD slicing criterions

bull We distinguish between forward and backward slices (or a combination)

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

18

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

20

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML backward-forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

21

Slicing XML Documents

bull What happens with DTDs Slices are well-formed but are they valid

bull For each XML slice we produce a DTD slice and viceversa

bull We guarantee that XML slices are valid with respect to DTD slices

DTD

document

SlicerSlicer

XMLdocument

DTD Slicedocument

XML SlicedocumentSlicing Criterion

22

Slicing XML Documents

bull A simple slicing algorithm

23

Slicing XML Documents

bull In the case of a DTD criterion composed by a set of positions C = p1hellippn Pos(D) the algorithm would be the same except that the first loop would be

For each v1v2(hellip)vn C do Vrsquo = Vrsquo v1 v1v2 hellip v1v2(hellip)vn Wrsquo = Wrsquo v1|iv2|j(hellip)vn|k Where v1v2(hellip)vn vrsquo and v1|iv2|j(hellip)vn|k X

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion

24

Slicing XML Documents

The following theorem states the correctness of the technique

Theorem Let D be a well-formed DTD and X a well-formed XML document valid with respect to D Given a slice Drsquo of D and a slice Xrsquo of X computed with an XML slicing criterion C and given a slice Drsquorsquo of D and a slice Xrsquorsquo of X computed with a DTD slicing criterion Crsquo then

a) Drsquo is well-formed and Xrsquo is valid with respect to Drsquob) Drsquorsquo is well-formed and Xrsquorsquo is valid with respect to Drsquorsquo

If all the elements in C are of one of the types in Crsquo then

c) Drsquo = Drsquorsquod) Xrsquo is a subtree of Xrsquorsquo

25

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Implementation

26

Implementation

We have implemented a prototype in Haskell

Haskell provides us a formal basis with many advantages for the manipulation of XML documents

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation In particular we use the following data structures that can represent any XMLHTML document

data Element = Elem Name [Attribute] [Content]data Attribute = (Name Value)data Content = CElem Element

| CText String

27

XML XSLT WebPage

(Data)(Presentation)

Implementation

From XML slices to Webpage slices

XML XSLT WebPage

(Data)(Presentation)

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (ie the former is generated if and only if the later is generated)

Both the XML data and the presentation labels are generated together

This does not imposes any restriction on the power of XSLT since the same webpages can be generated On the contrary this way of programming forces the programmer to build transformations that canbe easily reused and maintained because both the information and presentation data depending on the same condition are put together


Implementation

The implementation, some examples, and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml


Conclusions

We proposed the application of program slicing techniques to XML data structures.

We defined an algorithm to slice XML and DTD documents:
• The XML and DTD slices are well-formed and valid.
• Previous slicers can be used with a modest implementation effort.

Slicing Web Pages:
• The slicer can use XSLT in order to slice webpages.
• We proposed some guidelines to generate XSLT files.

Future Work:
• Migration to XML Schema.
• New implementation based on XQuery.


Slicing XML Documents

• We see XML documents and DTDs as trees

    <PersonalInfo>
      <Contact>
        <Status> Professor </Status>
        <Name> Ryan </Name>
        <Surname> Gibson </Surname>
      </Contact>
      <Teaching>
        <Subject>
          <Name> Logic </Name>
          <Sched> MonWed 16-18 </Sched>
          <Course> 4-Mathematics </Course>
        </Subject>
        <Subject>
          <Name> Algebra </Name>
          <Sched> MonTur 11-13 </Sched>
          <Course> 3-Mathematics </Course>
        </Subject>
        …
      </Teaching>
      <Research>
        <Project name = "SysLog" year = "2003-2004" budget = "16000€" />
      </Research>
    </PersonalInfo>

[Figure: tree representation of the document. The root PersonalInfo has the children Contact (Professor, Ryan, Gibson), Teaching (two Subject nodes: Logic, MonWed 16-18, 4-Mathematics; Algebra, MonTur 11-13, 3-Mathematics; …) and Research (Project: Syslog, 2003-2004, 16000 euro; …)]

Slicing XML Documents

• The slicing criterion is composed of a set of nodes in the tree.

• For each node in the slicing criterion, we extract from the tree all those nodes that are in the path from the root to the node.

[Figure: Web Page (Original) and Web Page (Slice); slicer options: XML / DTD, Forward / Backward]
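The path-based extraction described above can be sketched in a few lines of Haskell. This is an illustration under stated assumptions, not the prototype's code: documents are reduced to a bare `Tree` of element names, a criterion node is identified by its position (the list of 0-based child indices from the root), and the names `Tree`, `Pos`, `backwardSlice` and `example` are all hypothetical.

```haskell
-- A document tree reduced to element names (assumption: text and
-- attributes are ignored in this sketch).
data Tree = Node String [Tree]
  deriving (Show, Eq)

-- A position is the list of 0-based child indices leading from the
-- root to a node; [] denotes the root itself.
type Pos = [Int]

-- Keep only the nodes lying on a path from the root to some node of
-- the criterion. An empty criterion selects nothing.
backwardSlice :: [Pos] -> Tree -> Maybe Tree
backwardSlice [] _ = Nothing
backwardSlice crit (Node n ts) =
  Just (Node n [ t'
               | (i, t) <- zip [0 ..] ts
               , Just t' <- [backwardSlice [ps | (p : ps) <- crit, p == i] t] ])

-- The shape of the PersonalInfo example.
example :: Tree
example =
  Node "PersonalInfo"
    [ Node "Contact"  [Node "Status" [], Node "Name" [], Node "Surname" []]
    , Node "Teaching"
        [ Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []]
        , Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []] ]
    , Node "Research" [Node "Project" []]
    ]
```

For instance, `backwardSlice [[1, 0, 1]] example` selects the `Sched` node of the first `Subject` and keeps only its ancestors `PersonalInfo`, `Teaching` and `Subject`. Indices that do not exist in the tree are silently ignored in this sketch.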

Slicing XML Documents

• DTD backward slicing criterion

    <!ELEMENT PersonalInfo (Contact, Teaching, Research)>
    <!ELEMENT Contact (Status, Name, Surname)>
    <!ELEMENT Status ANY>
    <!ELEMENT Name ANY>
    <!ELEMENT Surname ANY>
    <!ELEMENT Teaching (Subject+)>
    <!ELEMENT Subject (Name, Sched, Course)>
    <!ELEMENT Sched ANY>
    <!ELEMENT Course ANY>
    <!ELEMENT Research (Project)>
    <!ELEMENT Project ANY>
    <!ATTLIST Project
        name   CDATA #REQUIRED
        year   CDATA #REQUIRED
        budget CDATA #IMPLIED
    >

[Figure: the DTD and its tree representation (PersonalInfo → Contact, Teaching, Research; Contact → Status, Name, Surname; Teaching → Subject → Name, Sched, Course; Research → Project → Name, Year, Budget), shown for the original document and for the slice; Web Page (Original), Web Page (Slice)]

Slicing XML Documents

• XML backward slicing criterion

[Figure: the example XML document and its tree representation, shown for the original document and for the slice; Web Page (Original), Web Page (Slice)]

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria.
  • XML slicing criteria are more fine-grained than DTD slicing criteria.

• We distinguish between forward and backward slices (or a combination).

[Figure: Web Page (Original) and Web Page (Slice); slicer options: XML / DTD, Forward / Backward]
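The difference between the slice directions can be illustrated with a small Haskell sketch. The slides do not define a forward slice formally, so this sketch assumes that a forward slice keeps the descendants of the criterion nodes, and that the combined slice keeps both the path from the root and the whole subtree below each criterion node. All names (`Tree`, `Pos`, `subtreeAt`, `forwardSlice`, `backwardForwardSlice`, `doc`) are hypothetical.

```haskell
data Tree = Node String [Tree]
  deriving (Show, Eq)

-- Position of a node: 0-based child indices from the root.
type Pos = [Int]

-- The subtree found at a position, if the position exists.
subtreeAt :: Pos -> Tree -> Maybe Tree
subtreeAt [] t = Just t
subtreeAt (i : is) (Node _ ts) =
  case drop i ts of
    (t : _) | i >= 0 -> subtreeAt is t
    _                -> Nothing

-- Assumed forward slice: the subtrees rooted at the criterion nodes.
forwardSlice :: [Pos] -> Tree -> [Tree]
forwardSlice crit t = [s | p <- crit, Just s <- [subtreeAt p t]]

-- Assumed combination: ancestors of each criterion node plus
-- everything below it.
backwardForwardSlice :: [Pos] -> Tree -> Maybe Tree
backwardForwardSlice crit t@(Node n ts)
  | null crit     = Nothing   -- nothing selected in this subtree
  | any null crit = Just t    -- criterion node: keep its whole subtree
  | otherwise     =
      Just (Node n [ t'
                   | (i, c) <- zip [0 ..] ts
                   , Just t' <- [backwardForwardSlice [ps | (p : ps) <- crit, p == i] c] ])

subject :: Tree
subject = Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []]

doc :: Tree
doc = Node "PersonalInfo"
        [Node "Contact" [], Node "Teaching" [subject, subject], Node "Research" []]
```

With the criterion `[[1, 0]]` (the first `Subject`), `forwardSlice` returns that `Subject` with its three children, while `backwardForwardSlice` additionally keeps `PersonalInfo` and `Teaching` above it, so the result is again a single rooted tree.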


Slicing XML Documents

• XML forward slicing criterion

[Figure: the example XML document and its tree representation, shown for the original document and for the slice; Web Page (Original), Web Page (Slice)]

Slicing XML Documents

• XML backward-forward slicing criterion

[Figure: the example XML document and its tree representation, shown for the original document and for the slice; Web Page (Original), Web Page (Slice)]

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa.

• We guarantee that XML slices are valid with respect to DTD slices.

[Figure: the Slicer takes a DTD document, an XML document and a slicing criterion, and produces a DTD slice document and an XML slice document]
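One way to picture how a DTD slice can be derived from an XML slice so that validity is preserved is the following Haskell sketch. It relies on a strong simplification that is not in the slides: a DTD is modelled as a list mapping each element name to the child element names it may contain, ignoring order, occurrence operators and attributes. `DTD`, `sliceDTD` and `validFor` are hypothetical names.

```haskell
data Tree = Node String [Tree]
  deriving (Show, Eq)

-- Simplified DTD (assumption): element name -> allowed child names.
type DTD = [(String, [String])]

-- Parent/child name pairs that occur in a tree.
edges :: Tree -> [(String, String)]
edges (Node n ts) = [(n, c) | Node c _ <- ts] ++ concatMap edges ts

-- All element names that occur in a tree.
names :: Tree -> [String]
names (Node n ts) = n : concatMap names ts

-- DTD slice induced by an XML slice: keep the declarations of the
-- elements that occur in the slice, restricted to the children that
-- actually occur below them.
sliceDTD :: Tree -> DTD -> DTD
sliceDTD xmlSlice dtd =
  [ (n, [c | c <- cs, (n, c) `elem` es])
  | (n, cs) <- dtd
  , n `elem` ns ]
  where
    es = edges xmlSlice
    ns = names xmlSlice

-- Weak validity check under the simplification above: every element
-- is declared and only contains declared children.
validFor :: DTD -> Tree -> Bool
validFor dtd (Node n ts) =
  case lookup n dtd of
    Nothing -> False
    Just cs -> all (\t@(Node c _) -> c `elem` cs && validFor dtd t) ts

exampleDTD :: DTD
exampleDTD =
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact", ["Status", "Name", "Surname"])
  , ("Status", []), ("Name", []), ("Surname", [])
  , ("Teaching", ["Subject"])
  , ("Subject", ["Name", "Sched", "Course"])
  , ("Sched", []), ("Course", [])
  , ("Research", ["Project"])
  , ("Project", [])
  ]

xmlSlice :: Tree
xmlSlice = Node "PersonalInfo" [Node "Teaching" [Node "Subject" [Node "Sched" []]]]
```

Here `sliceDTD xmlSlice exampleDTD` keeps only the declarations for `PersonalInfo`, `Teaching`, `Subject` and `Sched`, and `validFor` accepts the XML slice against that reduced DTD. A real implementation would also have to rewrite content models such as `(Name, Sched, Course)` so that the removed children are no longer required.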

Slicing XML Documents

• A simple slicing algorithm

[Figure: slicing algorithm for an XML slicing criterion]

Slicing XML Documents

• In the case of a DTD criterion composed of a set of positions $C = \{p_1, \ldots, p_n\} \subseteq Pos(D)$, the algorithm would be the same except that the first loop would be:

For each $v_1.v_2.(\ldots).v_n \in C$ do
    $V' = V' \cup \{v_1,\ v_1.v_2,\ \ldots,\ v_1.v_2.(\ldots).v_n\}$
    $W' = W' \cup \{v_1|_i.v_2|_j.(\ldots).v_n|_k\}$
    where $v_1.v_2.(\ldots).v_n \in V'$ and $v_1|_i.v_2|_j.(\ldots).v_n|_k \in X$

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
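The loop for a DTD criterion can be read as follows: every prefix of a criterion position goes into the DTD slice, and every XML node that is an instance of one of those prefixes goes into the XML slice. The Haskell sketch below follows that reading. It assumes that a DTD position can be modelled as the list of element names from the root, and the names `DtdPos`, `dtdSlicePositions` and `xmlSliceFor` are hypothetical.

```haskell
import Data.List (inits, nub)

data Tree = Node String [Tree]
  deriving (Show, Eq)

-- A DTD position, modelled as the element names from the root
-- (assumption made for this sketch).
type DtdPos = [String]

-- The set V': every non-empty prefix of every criterion position.
dtdSlicePositions :: [DtdPos] -> [DtdPos]
dtdSlicePositions crit = nub [p | c <- crit, p <- drop 1 (inits c)]

-- The set W', returned as a tree: keep every XML node whose path of
-- element names belongs to V'. V' is prefix-closed, so the ancestors
-- of a kept node are kept as well.
xmlSliceFor :: [DtdPos] -> Tree -> Maybe Tree
xmlSliceFor crit = go []
  where
    v' = dtdSlicePositions crit
    go path (Node n ts)
      | here `elem` v' = Just (Node n [t' | t <- ts, Just t' <- [go here t]])
      | otherwise      = Nothing
      where
        here = path ++ [n]

doc :: Tree
doc =
  Node "PersonalInfo"
    [ Node "Contact" [Node "Name" []]
    , Node "Teaching"
        [ Node "Subject" [Node "Name" [], Node "Sched" []]
        , Node "Subject" [Node "Name" [], Node "Sched" []] ]
    ]
```

With the criterion `[["PersonalInfo", "Teaching", "Subject", "Sched"]]`, the resulting XML slice contains the `Sched` node of both `Subject` elements. This shows why a DTD criterion is coarser than an XML criterion: it selects every instance of an element type rather than one particular node.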

Page 13: Josep F. Silva Galiana

13

Slicing XML Documents

bull The Slicing Criterion is composed by a set of nodes in the tree

bull For each node in the slicing criterion we extract from the tree all those nodes that are in the path from the root to the node

Web Page(Original)

Web Page(Slice)

XML DTD

Forward Backward

14

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

Slicing XML Documentsbull DTD backward slicing criterion

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Name Sched Course

SubjectStatus Name Surname

Name Year Budget

Project

PersonalInfo

Contact Teaching Research

ltELEMENT PersonalInfo (Contact Teaching Research)gt

ltELEMENT Contact (Status NameSurname)gt

ltELEMENT Status ANYgtltELEMENT Name ANYgtltELEMENT Surname ANYgtltELEMENT Teaching (Subject+)gtltELEMENT Subject (Name Sched

Course)gtltELEMENT Sched ANYgtltELEMENT Course ANYgtltELEMENT Research (Project)gtltELEMENT Project ANYgtltATTLIST Project

name CDATA REQUIREDyear CDATA REQUIREDbudget CDATA IMPLIED

gt

Web Page(Original)

Web Page(Slice)

15

Slicing XML Documentsbull XML backward slicing criterion

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Web Page(Original)

Web Page(Slice)

16

Slicing XML Documentsbull XML backward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

17

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria
• XML slicing criteria are more fine-grained than DTD slicing criteria
• We distinguish between forward and backward slices (or a combination)

[Figure: Web Page (Original) and Web Page (Slice); labels: XML, DTD, Forward, Backward.]

18

Slicing XML Documents

• DTD backward slicing criterion

<!ELEMENT PersonalInfo (Contact, Teaching, Research)>
<!ELEMENT Contact (Status, Name, Surname)>
<!ELEMENT Status ANY>
<!ELEMENT Name ANY>
<!ELEMENT Surname ANY>
<!ELEMENT Teaching (Subject+)>
<!ELEMENT Subject (Name, Sched, Course)>
<!ELEMENT Sched ANY>
<!ELEMENT Course ANY>
<!ELEMENT Research (Project)>
<!ELEMENT Project ANY>
<!ATTLIST Project
    name   CDATA #REQUIRED
    year   CDATA #REQUIRED
    budget CDATA #IMPLIED
>

[Figure: tree representation of the DTD (PersonalInfo with children Contact, Teaching and Research; Contact with Status, Name, Surname; Subject with Name, Sched, Course; Project with Name, Year, Budget); panels: Web Page (Original), Web Page (Slice).]

19

Slicing XML Documents

• XML forward slicing criterion

[Figure: the PersonalInfo XML document and its tree representation; panels: Web Page (Original), Web Page (Slice).]

20

Slicing XML Documents

• XML backward-forward slicing criterion

[Figure: the PersonalInfo XML document and its tree representation; panels: Web Page (Original), Web Page (Slice).]

21

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa

• We guarantee that XML slices are valid with respect to DTD slices

[Diagram: Slicer; inputs labelled DTD document, XML document and Slicing Criterion; outputs labelled DTD Slice document and XML Slice document.]
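One way to picture the pairing of XML and DTD slices is sketched below in Haskell. It is only an illustration under strong assumptions: the DTD is simplified to a map from an element name to the child names it allows (order, cardinality and attributes are ignored), and the names `Dtd`, `sliceDtd` and `personalInfoDtd` are hypothetical, not taken from the original slicer. The idea is that the DTD slice keeps only the declarations of the element types that survive in the XML slice and removes references to the discarded ones.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Assumption: a DTD is simplified to "element name -> allowed child names".
type Dtd = Map.Map String [String]

-- | Keep only the declarations of the given element names, and drop
-- references to removed elements from the remaining content models.
sliceDtd :: Set.Set String -> Dtd -> Dtd
sliceDtd kept dtd =
  Map.map (filter (`Set.member` kept))
          (Map.filterWithKey (\name _ -> Set.member name kept) dtd)

-- | Simplified version of the PersonalInfo DTD used in the slides.
personalInfoDtd :: Dtd
personalInfoDtd = Map.fromList
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact",      ["Status", "Name", "Surname"])
  , ("Teaching",     ["Subject"])
  , ("Subject",      ["Name", "Sched", "Course"])
  , ("Research",     ["Project"])
  ]

main :: IO ()
main =
  -- Element types assumed to survive in some XML slice.
  print (sliceDtd (Set.fromList ["PersonalInfo", "Teaching", "Subject"])
                  personalInfoDtd)
```

In this simplified model, an XML slice that only contains the kept element types cannot mention an element that the DTD slice no longer declares. A full treatment would also have to adapt content models such as `(Subject+)`.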

22

Slicing XML Documents

• A simple slicing algorithm

23

Slicing XML Documents

• In the case of a DTD criterion composed of a set of positions $C = \{p_1, \ldots, p_n\} \subseteq Pos(D)$, the algorithm would be the same, except that the first loop would be:

For each $v_1.v_2.(\ldots).v_n \in C$ do
  $V' = V' \cup \{v_1,\ v_1.v_2,\ \ldots,\ v_1.v_2.(\ldots).v_n\}$
  $W' = W' \cup \{v_1|_i.v_2|_j.(\ldots).v_n|_k\}$
where $v_1.v_2.(\ldots).v_n \in V'$ and $v_1|_i.v_2|_j.(\ldots).v_n|_k \in X$

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion
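The first loop for a DTD criterion collects every prefix of each criterion position. A minimal Haskell sketch of that prefix closure follows. It assumes that a position is the list of element names on the path from the root. The names `Position`, `prefixes` and `prefixClosure` are illustrative and do not come from the original implementation.

```haskell
import Data.List (inits, nub)

-- Assumption: a DTD position is the list of element names on the path
-- from the root, e.g. ["PersonalInfo", "Teaching", "Subject"].
type Position = [String]

-- | All non-empty prefixes of a position: v1, v1.v2, ..., v1.v2.(...).vn.
prefixes :: Position -> [Position]
prefixes = drop 1 . inits

-- | Accumulate the prefixes of every criterion position, without duplicates.
prefixClosure :: [Position] -> [Position]
prefixClosure = nub . concatMap prefixes

main :: IO ()
main =
  mapM_ print
    (prefixClosure
       [ ["PersonalInfo", "Teaching", "Subject"]
       , ["PersonalInfo", "Research"]
       ])
```

Shared prefixes such as `["PersonalInfo"]` are added only once, which mirrors the set union in the loop.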

24

Slicing XML Documents

The following theorem states the correctness of the technique

Theorem. Let $D$ be a well-formed DTD and $X$ a well-formed XML document valid with respect to $D$. Given a slice $D'$ of $D$ and a slice $X'$ of $X$ computed with an XML slicing criterion $C$, and given a slice $D''$ of $D$ and a slice $X''$ of $X$ computed with a DTD slicing criterion $C'$, then:

a) $D'$ is well-formed and $X'$ is valid with respect to $D'$
b) $D''$ is well-formed and $X''$ is valid with respect to $D''$

If all the elements in $C$ are of one of the types in $C'$, then:

c) $D' = D''$
d) $X'$ is a subtree of $X''$

25

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Implementation

26

Implementation

We have implemented a prototype in Haskell.

Haskell provides us with a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

data Element = Elem Name [Attribute] [Content]
type Attribute = (Name, Value)
data Content = CElem Element
             | CText String
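To make the data structures concrete, the following Haskell sketch computes a backward and a forward slice over them. It is an illustration and not the prototype's code. It assumes that `Name` and `Value` are strings, that a position is a list of 0-based child indices from the root, that a backward slice keeps the path from the root to the criterion node, and that a forward slice keeps the criterion node with all its descendants. `Position`, `backwardSlice`, `forwardSlice` and `doc` are hypothetical names.

```haskell
type Name = String
type Value = String
type Attribute = (Name, Value)

data Element = Elem Name [Attribute] [Content]
  deriving (Eq, Show)

data Content
  = CElem Element
  | CText String
  deriving (Eq, Show)

-- Assumption: a position is a list of 0-based indices into the [Content]
-- lists, read from the root element downwards.
type Position = [Int]

-- | Backward slice (sketch): keep the ancestors of each criterion node,
-- the criterion node itself and its direct text, and drop everything else.
backwardSlice :: [Position] -> Element -> Element
backwardSlice ps (Elem n attrs cs) =
  Elem n attrs [c' | (i, c) <- zip [0 ..] cs, Just c' <- [keep i c]]
  where
    isCriterion = any null ps
    keep i (CElem e) =
      case [rest | (j : rest) <- ps, j == i] of
        [] -> Nothing
        rests -> Just (CElem (backwardSlice rests e))
    keep _ t@(CText _)
      | isCriterion = Just t
      | otherwise = Nothing

-- | Forward slice (sketch): the criterion node with all of its descendants.
forwardSlice :: Position -> Element -> Maybe Element
forwardSlice [] e = Just e
forwardSlice (i : is) (Elem _ _ cs)
  | i < 0 = Nothing
  | otherwise =
      case drop i cs of
        (CElem e : _) -> forwardSlice is e
        _ -> Nothing

-- | A reduced version of the PersonalInfo document from the slides.
doc :: Element
doc =
  Elem "PersonalInfo" []
    [ CElem (Elem "Contact" [] [CElem (Elem "Status" [] [CText "Professor"])])
    , CElem (Elem "Teaching" []
        [ CElem (Elem "Subject" [] [CElem (Elem "Name" [] [CText "Logic"])])
        , CElem (Elem "Subject" [] [CElem (Elem "Name" [] [CText "Algebra"])])
        ])
    ]

main :: IO ()
main = do
  -- Criterion: the second Subject inside Teaching.
  print (backwardSlice [[1, 1]] doc)
  print (forwardSlice [1, 1] doc)
```

A combined backward-forward slice could be obtained by grafting the forward slice at the criterion position of the backward slice. That step is left out to keep the sketch short.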

27

Implementation

From XML slices to Webpage slices

[Diagram: XML (Data), XSLT (Presentation), WebPage.]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because the information and the presentation data that depend on the same condition are put together.

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples, and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Conclusions & Future Work

32

Conclusions

We proposed the application of program slicing techniques to XML data structures.

We defined an algorithm to slice XML and DTD documents:
• XML and DTD slices are well-formed and valid
• Previous slicers can be used with a modest implementation effort

Slicing Web Pages:
• The slicer can use XSLT in order to slice webpages
• We proposed some guidelines to generate XSLT files

Future Work:
• Migration to XML Schema
• New implementation based on XQuery

Page 16: Josep F. Silva Galiana

16

Slicing XML Documents
• XML backward slicing criterion

[Figure: tree view of the document (PersonalInfo with children Contact, Teaching and Research) illustrating the backward slicing criterion, next to screenshots "Web Page (Original)" and "Web Page (Slice)"]

<PersonalInfo>
  <Contact>
    <Status> Professor </Status>
    <Name> Ryan </Name>
    <Surname> Gibson </Surname>
  </Contact>
  <Teaching>
    <Subject>
      <Name> Logic </Name>
      <Sched> Mon/Wed 16-18 </Sched>
      <Course> 4-Mathematics </Course>
    </Subject>
    <Subject>
      <Name> Algebra </Name>
      <Sched> Mon/Tur 11-13 </Sched>
      <Course> 3-Mathematics </Course>
    </Subject>
    …
  </Teaching>
  <Research>
    <Project name = "SysLog" year = "2003-2004" budget = "16000€" />
  </Research>
</PersonalInfo>

17

Slicing XML Documents

• We distinguish between DTD and XML slicing criteria
• XML slicing criteria are more fine-grained than DTD slicing criteria
• We distinguish between forward and backward slices (or a combination of both)

[Figure: kinds of slicing criterion (XML, DTD) and slice directions (Forward, Backward), with screenshots "Web Page (Original)" and "Web Page (Slice)"]
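The difference between backward and forward slices can be sketched on a plain labelled tree. This is only an illustration under stated assumptions: the `Tree` and `Pos` types and the functions `backward`, `forward` and `both` are hypothetical names, and the reading "backward = path from the root to the criterion, forward = subtree below the criterion" is an assumption, not the deck's algorithm.

```haskell
-- A labelled tree standing in for an XML document (hypothetical type).
data Tree = Node String [Tree] deriving (Eq, Show)

-- A position is the list of child indices (from 0) leading from the root to a node.
type Pos = [Int]

-- Backward slice (assumed reading): keep only the nodes on the path
-- from the root to the criterion.
backward :: Pos -> Tree -> Tree
backward []     (Node l _)  = Node l []
backward (i:is) (Node l cs) =
  Node l [ backward is c | (j, c) <- zip [0 ..] cs, j == i ]

-- Forward slice (assumed reading): the subtree rooted at the criterion,
-- or Nothing if the position does not exist in the tree.
forward :: Pos -> Tree -> Maybe Tree
forward []     t = Just t
forward (i:is) (Node _ cs)
  | i >= 0 && i < length cs = forward is (cs !! i)
  | otherwise               = Nothing

-- Combination: the path to the criterion plus everything below it.
both :: Pos -> Tree -> Tree
both []     t           = t
both (i:is) (Node l cs) =
  Node l [ both is c | (j, c) <- zip [0 ..] cs, j == i ]

-- A cut-down version of the PersonalInfo example.
doc :: Tree
doc = Node "PersonalInfo"
  [ Node "Contact"  [Node "Status" [], Node "Name" [], Node "Surname" []]
  , Node "Teaching" [Node "Subject" [Node "Name" [], Node "Sched" [], Node "Course" []]]
  , Node "Research" [Node "Project" []]
  ]
```

With the criterion `[1, 0]` (the first Subject under Teaching), `backward` keeps PersonalInfo, Teaching and Subject, while `forward` returns the Subject subtree with its Name, Sched and Course children.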

18

Slicing XML Documents
• DTD backward slicing criterion

[Figure: DTD tree: PersonalInfo has children Contact (Status, Name, Surname), Teaching (Subject: Name, Sched, Course) and Research (Project: Name, Year, Budget), next to screenshots "Web Page (Original)" and "Web Page (Slice)"]

<!ELEMENT PersonalInfo (Contact, Teaching, Research)>
<!ELEMENT Contact (Status, Name, Surname)>
<!ELEMENT Status ANY>
<!ELEMENT Name ANY>
<!ELEMENT Surname ANY>
<!ELEMENT Teaching (Subject+)>
<!ELEMENT Subject (Name, Sched, Course)>
<!ELEMENT Sched ANY>
<!ELEMENT Course ANY>
<!ELEMENT Research (Project)>
<!ELEMENT Project ANY>
<!ATTLIST Project
  name CDATA #REQUIRED
  year CDATA #REQUIRED
  budget CDATA #IMPLIED
>
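A backward slice of the DTD above can be approximated with a very small model. The sketch below is an assumption-laden illustration, not the deck's algorithm: the `Dtd` type, `reaches` and `backwardDtd` are hypothetical names, the criterion is an element name rather than a position in the DTD tree, and occurrence markers and attribute lists are ignored.

```haskell
-- A DTD reduced to "element name -> names of its child elements"
-- (a simplification made for this sketch only).
type Dtd = [(String, [String])]

personalInfo :: Dtd
personalInfo =
  [ ("PersonalInfo", ["Contact", "Teaching", "Research"])
  , ("Contact",      ["Status", "Name", "Surname"])
  , ("Teaching",     ["Subject"])
  , ("Subject",      ["Name", "Sched", "Course"])
  , ("Research",     ["Project"])
  ]

-- Does element `from` reach element `to` through zero or more child steps?
reaches :: Dtd -> String -> String -> Bool
reaches dtd to from = go [] [from]
  where
    go _ [] = False
    go seen (x:xs)
      | x == to       = True
      | x `elem` seen = go seen xs
      | otherwise     = go (x : seen) (maybe [] id (lookup x dtd) ++ xs)

-- Backward slice (assumed reading): keep the declarations that lead to the
-- criterion, and in each kept declaration keep only the children that lead to it.
-- The criterion itself is kept with no children.
backwardDtd :: String -> Dtd -> Dtd
backwardDtd crit dtd =
  [ (e, filter (reaches dtd crit) cs)
  | (e, cs) <- dtd, e /= crit, reaches dtd crit e ]
  ++ [(crit, [])]
```

For the criterion `"Subject"`, this keeps PersonalInfo (restricted to Teaching), Teaching (restricted to Subject) and Subject itself; Contact and Research disappear from the slice.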

19

Slicing XML Documents
• XML forward slicing criterion

[Figure: the PersonalInfo XML document and its tree view, as on slide 16, illustrating the forward slicing criterion, next to screenshots "Web Page (Original)" and "Web Page (Slice)"]

20

Slicing XML Documents
• XML backward-forward slicing criterion

[Figure: the PersonalInfo XML document and its tree view, as on slide 16, illustrating the backward-forward slicing criterion, next to screenshots "Web Page (Original)" and "Web Page (Slice)"]

21

Slicing XML Documents

• What happens with DTDs? Slices are well-formed, but are they valid?

• For each XML slice we produce a DTD slice, and vice versa

• We guarantee that XML slices are valid with respect to DTD slices

[Figure: the Slicer takes a DTD document, an XML document and a slicing criterion, and produces a DTD slice document and an XML slice document]

22

Slicing XML Documents

• A simple slicing algorithm

23

Slicing XML Documents

• In the case of a DTD criterion composed of a set of positions $C = \{p_1, \ldots, p_n\} \subseteq Pos(D)$, the algorithm would be the same, except that the first loop would be:

For each $v_1.v_2.(\ldots).v_n \in C$ do
  $V' = V' \cup \{v_1,\ v_1.v_2,\ \ldots,\ v_1.v_2.(\ldots).v_n\}$
  $W' = W' \cup \{v_1|_i.v_2|_j.(\ldots).v_n|_k\}$
where $v_1.v_2.(\ldots).v_n \in V'$ and $v_1|_i.v_2|_j.(\ldots).v_n|_k \in X$

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion.
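The first loop described above can be sketched in a few lines. This is a minimal illustration under assumptions: positions are encoded as lists of element names (`DtdPos`) and XML positions as lists of name/index pairs (`XmlPos`), and the names `prefixes`, `sliceNodes` and `instances` are hypothetical, not taken from the prototype.

```haskell
import Data.List (inits, nub)

-- A DTD position v1.v2.(...).vn, encoded as the element names from the root.
type DtdPos = [String]

-- An XML position v1|i.v2|j.(...).vn|k, encoded as (element name, index) pairs.
type XmlPos = [(String, Int)]

-- All non-empty prefixes of one position: v1, v1.v2, ..., v1.v2.(...).vn.
prefixes :: DtdPos -> [DtdPos]
prefixes = drop 1 . inits

-- V': the union of the prefixes of every position in the criterion C.
sliceNodes :: [DtdPos] -> [DtdPos]
sliceNodes = nub . concatMap prefixes

-- W': the XML positions of the document whose element names spell out
-- one of the positions in the criterion C.
instances :: [DtdPos] -> [XmlPos] -> [XmlPos]
instances c xs = [ p | p <- xs, map fst p `elem` c ]
```

For the criterion `[["PersonalInfo", "Teaching", "Subject"]]`, `sliceNodes` yields the three prefixes ending in PersonalInfo, Teaching and Subject, and `instances` selects every Subject occurrence under Teaching in the XML document.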

24

Slicing XML Documents

The following theorem states the correctness of the technique.

Theorem. Let D be a well-formed DTD and X a well-formed XML document valid with respect to D. Given a slice D' of D and a slice X' of X computed with an XML slicing criterion C, and given a slice D'' of D and a slice X'' of X computed with a DTD slicing criterion C', then:

a) D' is well-formed and X' is valid with respect to D'
b) D'' is well-formed and X'' is valid with respect to D''

If all the elements in C are of one of the types in C', then:

c) D' = D''
d) X' is a subtree of X''

25

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Implementation

26

Implementation

We have implemented a prototype in Haskell.

Haskell provides us with a formal basis with many advantages for the manipulation of XML documents:

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation. In particular, we use the following data structures, which can represent any XML/HTML document:

data Element   = Elem Name [Attribute] [Content]
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
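To make the representation concrete, here is a small self-contained sketch that uses the simplified types from the slide (the types actually exported by HaXml may differ). The `Name` and `Value` synonyms, the `contact` value and the helpers `childNames` and `keepChildren` are assumptions introduced only for this illustration.

```haskell
-- Simplified types as shown on the slide; Name and Value are assumed to be strings.
type Name  = String
type Value = String

data Element   = Elem Name [Attribute] [Content] deriving (Eq, Show)
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
               deriving (Eq, Show)

-- The Contact element of the PersonalInfo example.
contact :: Element
contact =
  Elem "Contact" []
    [ CElem (Elem "Status"  [] [CText "Professor"])
    , CElem (Elem "Name"    [] [CText "Ryan"])
    , CElem (Elem "Surname" [] [CText "Gibson"])
    ]

-- Names of the direct child elements of an element (text nodes are skipped).
childNames :: Element -> [Name]
childNames (Elem _ _ cs) = [ n | CElem (Elem n _ _) <- cs ]

-- Keep only the child elements whose name satisfies the predicate; text is kept.
keepChildren :: (Name -> Bool) -> Element -> Element
keepChildren p (Elem n attrs cs) = Elem n attrs (filter ok cs)
  where
    ok (CElem (Elem m _ _)) = p m
    ok (CText _)            = True
```

A filter such as `keepChildren (/= "Status") contact` drops the Status child and leaves Name and Surname, which is the kind of tree pruning a slicer performs on this representation.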

27

Implementation

From XML slices to Webpage slices

[Figure: XML (Data) + XSLT (Presentation) → WebPage]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated).

Both the XML data and the presentation labels are generated together.

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because both the information and the presentation data depending on the same condition are put together.

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Conclusions & Future Work

32

Conclusions

We proposed the application of program slicing techniques to XML data structures

We defined an algorithm to slice XML and DTD documents
• XML and DTD slices that are well-formed and valid
• Previous slicers can be used with a modest implementation effort

Slicing Web Pages
• The slicer can use XSLT in order to slice webpages
• We proposed some guidelines to generate XSLT files

Future Work
• Migration to XML Schema
• New implementation based on XQuery

Page 19: Josep F. Silva Galiana

19

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

20

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

Slicing XML Documentsbull XML backward-forward slicing criterion

Logic MonWed 16-184-Mathematics

Subject

Algebra MonTur 11-133-Mathematics

Professor Ryan Gibson

Subject

Syslog 2003-2004 16000 euro

Project

hellip

PersonalInfo

Contact Teaching Research

hellip

ltPersonalInfogtltContactgtltStatusgt Professor ltStatusgt ltNamegt Ryan ltNamegtltSurnamegt Gibson ltSurnamegtltContactgt ltTeachinggtltSubjectgtltNamegt Logic ltNamegtltSchedgt MonWed 16-18 ltSchedgtltCoursegt 4-Mathematics ltCoursegtltSubjectgtltSubjectgt ltNamegt Algebra ltNamegtltSchedgt MonTur 11-13 ltSchedgtltCoursegt 3-Mathematics ltCoursegtltSubjectgt hellipltTeachinggtltResearchgtltProjectname = ldquoSysLogrsquorsquoyear = ldquo2003-2004rsquorsquobudget = ldquo16000eurorsquorsquo gtltResearchgt

ltPersonalInfogt

Web Page(Original)

Web Page(Slice)

21

Slicing XML Documents

bull What happens with DTDs Slices are well-formed but are they valid

bull For each XML slice we produce a DTD slice and viceversa

bull We guarantee that XML slices are valid with respect to DTD slices

DTD

document

SlicerSlicer

XMLdocument

DTD Slicedocument

XML SlicedocumentSlicing Criterion

22

Slicing XML Documents

bull A simple slicing algorithm

23

Slicing XML Documents

bull In the case of a DTD criterion composed by a set of positions C = p1hellippn Pos(D) the algorithm would be the same except that the first loop would be

For each v1v2(hellip)vn C do Vrsquo = Vrsquo v1 v1v2 hellip v1v2(hellip)vn Wrsquo = Wrsquo v1|iv2|j(hellip)vn|k Where v1v2(hellip)vn vrsquo and v1|iv2|j(hellip)vn|k X

Both algorithms produce valid XML and DTD slices with respect to the slicing criterion

24

Slicing XML Documents

The following theorem states the correctness of the technique

Theorem Let D be a well-formed DTD and X a well-formed XML document valid with respect to D Given a slice Drsquo of D and a slice Xrsquo of X computed with an XML slicing criterion C and given a slice Drsquorsquo of D and a slice Xrsquorsquo of X computed with a DTD slicing criterion Crsquo then

a) Drsquo is well-formed and Xrsquo is valid with respect to Drsquob) Drsquorsquo is well-formed and Xrsquorsquo is valid with respect to Drsquorsquo

If all the elements in C are of one of the types in Crsquo then

c) Drsquo = Drsquorsquod) Xrsquo is a subtree of Xrsquorsquo

25

MotivationProgram SlicingXML

bull DTDbull XSLT

Slicing XML Documentsbull Example

ImplementationConclusions amp Future Work

Contents

Implementation

26

Implementation

We have implemented a prototype in Haskell

Haskell provides us a formal basis with many advantages for the manipulation of XML documents

- The HaXml library

It allows us to automatically translate XML or HTML documents into a Haskell representation In particular we use the following data structures that can represent any XMLHTML document

    data Element   = Elem Name [Attribute] [Content]
    type Attribute = (Name, Value)
    data Content   = CElem Element
                   | CText String
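To illustrate how a slice could be computed over this representation, here is a minimal, self-contained sketch. It is not the prototype's code: `Path`, `sliceAt` and `example` are hypothetical names, `Name` and `Value` are assumed to be strings, and `Attribute` is written as a type synonym so that the block compiles. The sketch assumes that a slice keeps the ancestors of the selected node together with all of its descendants, and that the criterion is given as a chain of child indices from the root.

```haskell
import Data.Maybe (mapMaybe)

type Name  = String
type Value = String

data Element   = Elem Name [Attribute] [Content] deriving (Eq, Show)
type Attribute = (Name, Value)
data Content   = CElem Element
               | CText String
               deriving (Eq, Show)

-- Hypothetical criterion: 0-based child indices from the root element
-- down to the selected node, counting every Content item.
type Path = [Int]

-- Keep the nodes along the path (ancestors) and the whole subtree of the
-- selected node (descendants); siblings off the path are dropped.
sliceAt :: Path -> Element -> Element
sliceAt []       e                 = e
sliceAt (i : is) (Elem n attrs cs) = Elem n attrs (mapMaybe keep (zip [0 ..] cs))
  where
    keep (j, CElem child) | j == i            = Just (CElem (sliceAt is child))
    keep (j, c@(CText _)) | j == i && null is = Just c
    keep _                                    = Nothing

-- A reduced version of the PersonalInfo document used in the slides.
example :: Element
example =
  Elem "PersonalInfo" []
    [ CElem (Elem "Contact" [] [CElem (Elem "Name" [] [CText "Ryan"])])
    , CElem (Elem "Teaching" []
        [ CElem (Elem "Subject" [] [CElem (Elem "Name" [] [CText "Logic"])])
        , CElem (Elem "Subject" [] [CElem (Elem "Name" [] [CText "Algebra"])])
        ])
    ]
```

For instance, `sliceAt [1, 0] example` selects the first Subject under Teaching: the result keeps PersonalInfo, Teaching and the Logic subject, and drops Contact and the Algebra subject.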

27

Implementation

From XML slices to Webpage slices

[Figure, shown twice on the slide: XML (Data), XSLT (Presentation), WebPage]

28

Implementation

XSLT Implementation Guidelines

XSLT documents must generate the information and the presentation elements under the same conditions (i.e., the former is generated if and only if the latter is generated)

Both the XML data and the presentation labels are generated together

This does not impose any restriction on the power of XSLT, since the same webpages can be generated. On the contrary, this way of programming forces the programmer to build transformations that can be easily reused and maintained, because both the information and the presentation data depending on the same condition are put together

29

Implementation

XSLT Implementation Guidelines

30

Implementation

The implementation, some examples and other material are publicly available at:

www.dsic.upv.es/~jsilva/xml

31

Contents

Motivation
Program Slicing
XML
  • DTD
  • XSLT
Slicing XML Documents
  • Example
Implementation
Conclusions & Future Work

Conclusions & Future Work

32

Conclusions

We proposed the application of program slicing techniques to XML data structures

We defined an algorithm to slice XML and DTD documents

  • XML and DTD slices that are well-formed and valid
  • Previous slicers can be used with a modest implementation effort

Slicing Web Pages
  • The slicer can use XSLT in order to slice webpages
  • We proposed some guidelines to generate XSLT files

Future Work
  • Migration to XML Schema
  • New implementation based on XQuery

Page 20: Josep F. Silva Galiana

20

Slicing XML Documents

• XML backward-forward slicing criterion

    <PersonalInfo>
      <Contact>
        <Status> Professor </Status>
        <Name> Ryan </Name>
        <Surname> Gibson </Surname>
      </Contact>
      <Teaching>
        <Subject>
          <Name> Logic </Name>
          <Sched> Mon/Wed 16-18 </Sched>
          <Course> 4-Mathematics </Course>
        </Subject>
        <Subject>
          <Name> Algebra </Name>
          <Sched> Mon/Tur 11-13 </Sched>
          <Course> 3-Mathematics </Course>
        </Subject>
        …
      </Teaching>
      <Research>
        <Project name = "SysLog" year = "2003-2004" budget = "16000euro" />
      </Research>
    </PersonalInfo>

[Figure: tree view of the document. Root PersonalInfo with children Contact (Professor, Ryan, Gibson), Teaching (Subject: Logic, Mon/Wed 16-18, 4-Mathematics; Subject: Algebra, Mon/Tur 11-13, 3-Mathematics; …) and Research (Project: Syslog, 2003-2004, 16000 euro)]

[Figure: the XML document and its tree are shown twice, next to Web Page (Original) and Web Page (Slice)]
