XCube

XML For Data Warehouses

By Sven Groot

Data warehouses

Contain data drawn from several databases and external sources
Provide a comprehensive view of all aspects of an enterprise
Complemented by increased emphasis on powerful analysis tools
  – SQL is inadequate
  – OLAP: OnLine Analytic Processing

Data Warehousing

[Figure: external data sources and operational databases feed the data warehouse through extract, clean, transform, load and refresh steps. The data warehouse, with its metadata repository, serves OLAP, visualisation and data mining.]

OLAP

Multidimensional data model

[Figure: three-dimensional cube of sales values with axes pid, timeid and locid.]

OLAP (cont’d)

Multidimensional data as a relation

Locations

locid  city    state    country
1      Ames    Iowa     USA
2      Leiden  ZH       Holland
3      Tempe   Arizona  USA

Products

pid  pname      category     price
11   Lee Jeans  Apparel      25
12   X-Box      Electronics  150
13   Biro Pen   Stationery   2

Sales

pid  timeid  locid  sales
11   1       1      25
11   2       1      8
11   3       1      15
12   1       1      30
12   2       1      20
12   3       1      50
13   1       1      8
13   2       1      10
13   3       1      10
11   1       2      35
11   2       2      22

OLAP (cont’d)

Dimensions as hierarchies

[Figure: one hierarchy per dimension, listed here from coarsest to finest level.]
PRODUCT: category, pname
TIME: year, quarter, week and month (side by side), date
LOCATION: country, state, city

OLAP (cont’d)

Typical OLAP queries
  – Find the total sales
  – Find total sales for each city
  – Find total sales for each state
  – Find the top five products ranked by total sales
Possible to drill-down and roll-up on dimensions
Pivoting
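The typical queries listed above can be sketched in a few lines of plain Python over the sample rows of the Locations, Products and Sales relations. This is only an illustration of rolling up along a dimension: the helper name `total_by` is made up for this sketch, and a real OLAP engine would work quite differently.

```python
from collections import defaultdict

# Rows copied from the Locations, Products and Sales relations in the slides.
locations = {  # locid -> (city, state, country)
    1: ("Ames", "Iowa", "USA"),
    2: ("Leiden", "ZH", "Holland"),
    3: ("Tempe", "Arizona", "USA"),
}
products = {11: "Lee Jeans", 12: "X-Box", 13: "Biro Pen"}  # pid -> pname
sales = [  # (pid, timeid, locid, sales)
    (11, 1, 1, 25), (11, 2, 1, 8), (11, 3, 1, 15),
    (12, 1, 1, 30), (12, 2, 1, 20), (12, 3, 1, 50),
    (13, 1, 1, 8), (13, 2, 1, 10), (13, 3, 1, 10),
    (11, 1, 2, 35), (11, 2, 2, 22),
]


def total_by(key):
    """Sum the sales measure grouped by key(row): a roll-up along one dimension."""
    totals = defaultdict(int)
    for row in sales:
        totals[key(row)] += row[3]
    return dict(totals)


total_sales = sum(row[3] for row in sales)            # find the total sales
by_city = total_by(lambda r: locations[r[2]][0])      # total sales for each city
by_state = total_by(lambda r: locations[r[2]][1])     # roll-up: city -> state
by_product = total_by(lambda r: products[r[0]])
# Top five products ranked by total sales (only three exist in the sample).
top_products = sorted(by_product.items(), key=lambda kv: kv[1], reverse=True)[:5]
```

Moving from `by_city` to `by_state` is a roll-up on the location dimension; going the other way is a drill-down.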

eXtensible Markup Language

Contains nodes that may be processing instructions, elements, attributes, CDATA sections or comments.
Must be well-formed
Format can be defined by a DTD or XSD.
Multiple formats in one document using namespaces.
Can be transformed using XSLT

<?xml version="1.0" encoding="utf-8"?>
<!-- Library XML File -->
<Library xmlns="http://www.liacs.nl/~sgroot/library.xsd">
  <Book isbn="0072322063" title="Database Management Systems">
    <Author name="Raghu Ramakrishnan" />
    <Author name="Johannes Gehrke" />
    <Notes>Second Edition</Notes>
  </Book>
</Library>
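As a small illustration of the namespace point, the library document from this slide can be read with Python's standard `xml.etree.ElementTree`. The prefix `lib` is an arbitrary name chosen for this sketch; only the namespace URI comes from the slide.

```python
import xml.etree.ElementTree as ET

# The library document from the slide, with plain ASCII quotes.
LIBRARY_XML = """<?xml version="1.0" encoding="utf-8"?>
<!-- Library XML File -->
<Library xmlns="http://www.liacs.nl/~sgroot/library.xsd">
  <Book isbn="0072322063" title="Database Management Systems">
    <Author name="Raghu Ramakrishnan" />
    <Author name="Johannes Gehrke" />
    <Notes>Second Edition</Notes>
  </Book>
</Library>"""

# Every element lives in the default namespace, so lookups must be qualified.
# "lib" is a local prefix chosen here; it does not appear in the document.
NS = {"lib": "http://www.liacs.nl/~sgroot/library.xsd"}

root = ET.fromstring(LIBRARY_XML.encode("utf-8"))
book = root.find("lib:Book", NS)
title = book.get("title")
authors = [a.get("name") for a in book.findall("lib:Author", NS)]
notes = book.findtext("lib:Notes", namespaces=NS)
```

An unqualified `root.find("Book")` would return `None` here, because the default namespace applies to every element in the document.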

Data Warehouses Reloaded

Data warehousing occurs across departments all over the globe, and also across companies
External data sources might include WWW and other data warehouses
One flexible format for exchanging data cubes would be useful: XCube

XCube Scenarios

Download

XCube Scenarios (cont’d)

Query

XCube Scenarios (cont’d)

Generating
  – Conversion of any data into data cube
  – Using data from a warehouse in data cube

Requirements for online cubes

Support for multidimensional data model.
Support for conceptual distinction between schema, dimension and fact data.
Transportable over the network.
Linking and inclusion concepts are needed for flexibility and reuse.
Extensible to adapt to different data models or new concepts.
Easily convertible to and from various sources and formats.
Possibly allow OLAP processing to reduce data transfer.

XCube formats

XCubeSchema

<multidimensionalSchema version="0.4" xmlns="http://www.xcube-open.org/V0_4/XCubeSchema.xcsd">
  <cubeSchema id="sale">
    <fact id="sales" />
    <fact id="revenue" />
    <dimension id="geography" granularity="branch" />
    <dimension id="product" granularity="article" />
  </cubeSchema>
  <classSchema>
    <!-- geography -->
    <classLevel id="branch">
      <attribute id="manager" />
      <rollUp toLevel="city" />
    </classLevel>
    <classLevel id="city">
      <rollUp toLevel="region" />
    </classLevel>
    <!-- ... -->
    <!-- product -->
    <classLevel id="article">
      <attribute id="articleName" />
      <attribute id="brand" />
      <rollUp toLevel="productGroup" />
    </classLevel>
    <classLevel id="productGroup">
      <rollUp toLevel="productFamily" />
    </classLevel>
    <!-- ... -->
  </classSchema>
</multidimensionalSchema>
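To make the roll-up structure in the schema concrete, the following sketch reads an abridged copy of the XCubeSchema example and derives, for each dimension, the chain of levels reachable from its granularity. It assumes each level rolls up to at most one parent, which holds for this example but is not stated as a rule in the slides. The function name `roll_up_path` is invented for the sketch.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.xcube-open.org/V0_4/XCubeSchema.xcsd"}

# Abridged copy of the XCubeSchema example (some attributes omitted).
SCHEMA_XML = """<multidimensionalSchema version="0.4"
    xmlns="http://www.xcube-open.org/V0_4/XCubeSchema.xcsd">
  <cubeSchema id="sale">
    <fact id="sales" />
    <fact id="revenue" />
    <dimension id="geography" granularity="branch" />
    <dimension id="product" granularity="article" />
  </cubeSchema>
  <classSchema>
    <classLevel id="branch"><rollUp toLevel="city" /></classLevel>
    <classLevel id="city"><rollUp toLevel="region" /></classLevel>
    <classLevel id="article"><rollUp toLevel="productGroup" /></classLevel>
    <classLevel id="productGroup"><rollUp toLevel="productFamily" /></classLevel>
  </classSchema>
</multidimensionalSchema>"""

root = ET.fromstring(SCHEMA_XML)

# level id -> id of the level it rolls up to (assumption: a single parent).
parent = {}
for level in root.findall("x:classSchema/x:classLevel", NS):
    roll_up = level.find("x:rollUp", NS)
    if roll_up is not None:
        parent[level.get("id")] = roll_up.get("toLevel")


def roll_up_path(level_id):
    """Follow rollUp links from a level until no further parent is declared."""
    path = [level_id]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


facts = [f.get("id") for f in root.findall("x:cubeSchema/x:fact", NS)]
paths = {d.get("id"): roll_up_path(d.get("granularity"))
         for d in root.findall("x:cubeSchema/x:dimension", NS)}
```

The paths stop at `region` and `productFamily` because the example elides the remaining levels with `<!-- ... -->`.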

XCube formats (cont’d)

XCubeDimension

<dimensionData version="0.4" xmlns="http://www.xcube-open.org/V0_4/XCubeDimension_base.xcsd">
  <units>
    <entry unitType="currency" unit="EUR" />
  </units>
  <classification>
    <!-- dimension: geography -->
    <level id="country">
      <node id="Germany" />
      <node id="Switzerland" />
      <node id="France" />
      <!-- ... -->
    </level>
    <level id="region">
      <node id="Northern Germany">
        <rollUp toNode="Germany" level="country" />
      </node>
      <node id="Western Germany">
        <rollUp toNode="Germany" level="country" />
      </node>
      <node id="Eastern Germany">
        <rollUp toNode="Germany" level="country" />
      </node>
      <node id="Southern Germany">
        <rollUp toNode="Germany" level="country" />
      </node>
      <!-- ... -->
    </level>
    <!-- ... -->
  </classification>
</dimensionData>
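The dimension data stores the hierarchy bottom-up: each node names the node it rolls up to. A client that wants to drill down needs the inverse mapping. The sketch below builds it from an abridged copy of the XCubeDimension example; the variable name `children` is an illustrative choice, not part of XCube.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"d": "http://www.xcube-open.org/V0_4/XCubeDimension_base.xcsd"}

# Abridged copy of the XCubeDimension example (units section omitted).
DIMENSION_XML = """<dimensionData version="0.4"
    xmlns="http://www.xcube-open.org/V0_4/XCubeDimension_base.xcsd">
  <classification>
    <level id="country">
      <node id="Germany" />
      <node id="Switzerland" />
      <node id="France" />
    </level>
    <level id="region">
      <node id="Northern Germany"><rollUp toNode="Germany" level="country" /></node>
      <node id="Western Germany"><rollUp toNode="Germany" level="country" /></node>
      <node id="Eastern Germany"><rollUp toNode="Germany" level="country" /></node>
      <node id="Southern Germany"><rollUp toNode="Germany" level="country" /></node>
    </level>
  </classification>
</dimensionData>"""

root = ET.fromstring(DIMENSION_XML)

# (parent level, parent node) -> child node ids, i.e. the drill-down direction.
children = defaultdict(list)
for level in root.findall("d:classification/d:level", NS):
    for node in level.findall("d:node", NS):
        for roll_up in node.findall("d:rollUp", NS):
            key = (roll_up.get("level"), roll_up.get("toNode"))
            children[key].append(node.get("id"))

countries = [n.get("id")
             for n in root.findall("d:classification/d:level[@id='country']/d:node", NS)]
```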

XCube formats (cont’d)

XCubeFact

<cubeFacts version="0.4" xmlns="http://www.xcube-open.org/V0_4/XCubeFact_base.xcsd">
  <cube id="sale">
    <cell>
      <dimension id="geography" node="branch48" />
      <dimension id="product" node="MA-450" />
      <dimension id="time" node="2003-07-24" />
      <fact id="sales" value="3" />
      <fact id="revenue" value="960" />
    </cell>
    <cell>
      <dimension id="geography" node="branch75" />
      <dimension id="product" node="MA-450" />
      <dimension id="time" node="2003-07-24" />
      <fact id="sales" value="2" />
      <fact id="revenue" value="640" />
    </cell>
    <!-- ... -->
  </cube>
  <!-- ... -->
</cubeFacts>
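Each cell pairs a coordinate (one node per dimension) with its fact values. As a rough sketch, the two cells of the XCubeFact example can be loaded and summed per product as follows. The helper `read_cells` is hypothetical, and treating every fact value as a number is an assumption that happens to hold for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"f": "http://www.xcube-open.org/V0_4/XCubeFact_base.xcsd"}

# The two cells shown in the XCubeFact example.
FACT_XML = """<cubeFacts version="0.4"
    xmlns="http://www.xcube-open.org/V0_4/XCubeFact_base.xcsd">
  <cube id="sale">
    <cell>
      <dimension id="geography" node="branch48" />
      <dimension id="product" node="MA-450" />
      <dimension id="time" node="2003-07-24" />
      <fact id="sales" value="3" />
      <fact id="revenue" value="960" />
    </cell>
    <cell>
      <dimension id="geography" node="branch75" />
      <dimension id="product" node="MA-450" />
      <dimension id="time" node="2003-07-24" />
      <fact id="sales" value="2" />
      <fact id="revenue" value="640" />
    </cell>
  </cube>
</cubeFacts>"""


def read_cells(xml_text, cube_id):
    """Return a list of (coordinates, facts) pairs for one cube."""
    root = ET.fromstring(xml_text)
    cube = root.find(f"f:cube[@id='{cube_id}']", NS)
    cells = []
    for cell in cube.findall("f:cell", NS):
        coords = {d.get("id"): d.get("node") for d in cell.findall("f:dimension", NS)}
        # Assumption: fact values are numeric.
        facts = {m.get("id"): float(m.get("value")) for m in cell.findall("f:fact", NS)}
        cells.append((coords, facts))
    return cells


cells = read_cells(FACT_XML, "sale")

# Aggregate over geography and time, keeping only the product dimension.
totals = defaultdict(lambda: defaultdict(float))
for coords, facts in cells:
    for name, value in facts.items():
        totals[coords["product"]][name] += value
```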

XCube extended formats

XCubeText
  – Adds textual description for nearly every element.
  – A future version will allow separate files.
  – Allows different levels of detail (short, medium, long, html)

XCube extended formats (cont’d)

XCubeQuery
  – Organise interactive dialog between client and server
  – Meant to facilitate more efficient exchange of data
  – Consists of seven different query formats

XCubeQuery

List of available cubes
  – Request:

<request>
  <getCubeSchemaList />
</request>

  – Response:

<cubeSchema id="sale" />
<cubeSchema id="purchase" />
<cubeSchema id="stock" />
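A client-side sketch of this first exchange might look as follows. It only builds the request text and parses a response, since the slides do not say how messages are transported. The function names and the wrapping `<response>` element are assumptions made for this example, the latter because the response shown is a sequence of sibling elements without a single root.

```python
import xml.etree.ElementTree as ET


def build_cube_list_request():
    """Serialise the getCubeSchemaList request shown on the slide."""
    request = ET.Element("request")
    ET.SubElement(request, "getCubeSchemaList")
    return ET.tostring(request, encoding="unicode")


def parse_cube_list(response_text):
    """Return the ids of the cubes listed in a response.

    Assumption: the elements are wrapped in a hypothetical <response> root
    before parsing, because an XML parser needs exactly one root element.
    """
    root = ET.fromstring(f"<response>{response_text}</response>")
    return [cube.get("id") for cube in root.findall("cubeSchema")]


# Response text as shown on the slide, with plain ASCII quotes.
RESPONSE = """<cubeSchema id="sale" />
<cubeSchema id="purchase" />
<cubeSchema id="stock" />"""

request_text = build_cube_list_request()
cube_ids = parse_cube_list(RESPONSE)
```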

XCubeQuery (cont’d)

Getting the schema of a specific cube
  – Request:

<request>
  <getCubeSchema id="sale" />
</request>

  – Response:

<cubeSchema id="sale" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <fact id="sales">
    <defaultAggregate>
      <aggregation operator="sum"/>
      <aggregation operator="max"/>
      <aggregation operator="min"/>
    </defaultAggregate>
  </fact>
  <fact id="revenue"/>
  <dimension id="geography" granularity="branch"/>
  <dimension id="product" granularity="article"/>
  <dimension id="time" granularity="xs:date" stdLevel="true"/>
</cubeSchema>
<dataTypes/>
<unitTypes/>

XCubeQuery (cont’d)

Querying the Classification Schema
  – Request:

<request>
  <getClassSchema>
    <dimension id="time"/>
    <dimension id="geography"/>
  </getClassSchema>
</request>

  – Response:

<classSchema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <stdTimeClassLevel id="xs:gYearMonth">
    <rollUp toLevel="quarter" />
  </stdTimeClassLevel>
  <timeClassLevel id="quarter" timeBase="quarter">
    <rollUp toLevel="xs:gYear" stdLevel="true" />
  </timeClassLevel>
  <classLevel id="branch">
    <attribute id="manager" />
    <rollUp toLevel="city" />
  </classLevel>
  <classLevel id="city">
    <rollUp toLevel="region" />
    <addKey level="region" />
  </classLevel>
  <classLevel id="region">
    <rollUp toLevel="country" />
  </classLevel>
  <classLevel id="country" />
</classSchema>
<dataTypes>
  <dataType name="quarter">
    <xs:restriction base="xs:gYearMonth">
      <xs:pattern value="[0-9]{4}-0[1-4]" />
    </xs:restriction>
  </dataType>
</dataTypes>
<unitTypes />
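The `quarter` data type in this response restricts `xs:gYearMonth` with the pattern `[0-9]{4}-0[1-4]`, so a quarter is written like a year-month whose month part is 01 to 04. A minimal sketch of checking such values and rolling a month up to its quarter is given below. The month-to-quarter arithmetic is an assumption (quarter 1 covers January to March, and so on); the slides give only the pattern.

```python
import re

# Pattern copied from the <xs:pattern> facet of the "quarter" data type.
QUARTER_PATTERN = re.compile(r"[0-9]{4}-0[1-4]")


def is_quarter(value):
    """XSD patterns must match the whole value, hence fullmatch, not search."""
    return QUARTER_PATTERN.fullmatch(value) is not None


def month_to_quarter(year_month):
    """Roll an xs:gYearMonth value such as '2003-07' up to a quarter value.

    Assumption: quarter n covers months 3n-2 to 3n of the same year.
    """
    year, month = year_month.split("-")
    return f"{year}-0{(int(month) - 1) // 3 + 1}"
```

For instance, `month_to_quarter("2003-07")` yields `"2003-03"`, the third quarter of 2003, which `is_quarter` accepts.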

XCubeQuery (cont’d)

Querying Classification Nodes
  – Request:

<request>
  <getClassNodes level="branch"/>
</request>

  – Response:

<level id="branch">
  <node id="branch48">
    <rollUp toNode="Frankfurt" level="city"/>
    <attribute id="manager" value="Meier"/>
  </node>
  <node id="branch75">
    <rollUp toNode="Frankfurt" level="city"/>
    <attribute id="manager" value="Bauer"/>
  </node>
  <!-- ... -->
</level>
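With the classification nodes in hand, a client can roll facts up from branch to city on its own. The sketch below does this for the node response above. The per-branch sales figures (3 and 2) are taken from the two cells of the XCubeFact example, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The getClassNodes response for level "branch", with plain ASCII quotes.
NODES_XML = """<level id="branch">
  <node id="branch48">
    <rollUp toNode="Frankfurt" level="city"/>
    <attribute id="manager" value="Meier"/>
  </node>
  <node id="branch75">
    <rollUp toNode="Frankfurt" level="city"/>
    <attribute id="manager" value="Bauer"/>
  </node>
</level>"""

level = ET.fromstring(NODES_XML)

branch_to_city = {}
managers = {}
for node in level.findall("node"):
    branch = node.get("id")
    branch_to_city[branch] = node.find("rollUp[@level='city']").get("toNode")
    managers[branch] = node.find("attribute[@id='manager']").get("value")

# Units sold per branch, from the two cells of the XCubeFact example.
branch_sales = {"branch48": 3, "branch75": 2}

# Roll-up: replace each branch by its city and add up the measure.
city_sales = defaultdict(int)
for branch, units in branch_sales.items():
    city_sales[branch_to_city[branch]] += units
```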

XCube extended formats (cont’d)

XCubeFunction
  – Still under development
  – Query XCube server about its functionality

XCube formats summary

XCubeSchema
XCubeDimension
XCubeFact
XCubeText
XCubeQuery
XCubeFunction

Related work

Common Warehouse Metamodel
MetaCube-X
XML for Analysis

Where from here

Basis for more complex and efficient infrastructure.
Combination with XML Web Services
Evolution of XCubeText
Create new data warehouses with XCube standards.

References

Wolfgang Hümmer, Andreas Bauer & Gunnar Harde; XCube – XML For Data Warehouses; DOLAP’03, November 7, 2003.
http://www.xcube-open.org
Raghu Ramakrishnan & Johannes Gehrke; Database Management Systems, second edition; McGraw-Hill, 2000
T. Bray, J. Paoli, C.M. Sperberg-McQueen & E. Maler; Extensible Markup Language (XML) 1.0 (Second Edition); W3C Recommendation 6 October 2000; http://www.w3.org/TR/REC-xml