#PBCAT The Lumberjack - Xpath 101 Thomas Weinert


Page 1: Lumberjack XPath 101

#PBCAT

The Lumberjack - XPath 101
Thomas Weinert

Page 2: Lumberjack XPath 101

About Me

● Application Developer
  ● PHP
  ● JavaScript
  ● XSL
● papaya Software GmbH
  ● papaya CMS
  ● Technical Director
● FluentDOM

Page 3: Lumberjack XPath 101

Questions!

Please ask any time!

Page 4: Lumberjack XPath 101

XPath 1

● XML Path Language
● W3C Recommendation 16 November 1999
● Used by
  ● XSLT 1
  ● XPointer

Page 5: Lumberjack XPath 101

XPath 2

● W3C Recommendation 23 January 2007
● Superset of XPath 1
● More data types

Page 6: Lumberjack XPath 101

DOM

● Document Object Model
● Standard extension: ext/dom
● LibXml2
  ● XPath 1

Page 7: Lumberjack XPath 101

DOMXPath

● Create after loading the document!
● evaluate()/query()

<?php
$str = '<sample><element/></sample>';
$dom = new DOMDocument();
$dom->loadXML($str);
$xpath = new DOMXPath($dom);

var_dump($xpath->evaluate('//element'));
var_dump($xpath->evaluate('//noelement'));
var_dump($xpath->evaluate('//noelement/@attr'));
?>

object(DOMNodeList)[5]
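
A minimal sketch of how evaluate() and query() differ, reusing the sample document from the slide. The variable names ($found, $missing, $count, $exists) are illustrative and not part of the talk.

```php
<?php
// Sample document from the slide.
$dom = new DOMDocument();
$dom->loadXML('<sample><element/></sample>');
$xpath = new DOMXPath($dom);

// Location paths give a DOMNodeList from both methods,
// even when nothing matches (the list is then empty).
$found   = $xpath->evaluate('//element');
$missing = $xpath->query('//noelement');
var_dump($found->length);
var_dump($missing->length);

// Only evaluate() hands back scalar results such as numbers or booleans.
$count  = $xpath->evaluate('count(//element)');
$exists = $xpath->evaluate('boolean(//noelement)');
var_dump($count);
var_dump($exists);
```

Checking `->length` on the list, or wrapping the path in `count()` or `boolean()`, avoids relying on the truthiness of the returned object.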

Page 8: Lumberjack XPath 101

SimpleXML

● Always returns SimpleXML

<?php
$str = '<sample><element/></sample>';
$xml = simplexml_load_string($str);

var_dump($xml->xpath('//element'));
var_dump($xml->xpath('//noelement'));
var_dump($xml->xpath('//noelement/@attr'));
?>

boolean false
array 0 => object(SimpleXMLElement)[2]
array empty
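
Because the slide shows both an empty array and `false` as possible results, calling code may want to normalise them. A sketch, assuming a hypothetical helper named xpathAll() that is not part of SimpleXML:

```php
<?php
// Hypothetical helper: always return an array, whatever
// SimpleXMLElement::xpath() gives back for an empty result.
function xpathAll(SimpleXMLElement $xml, string $expression): array
{
    $result = $xml->xpath($expression);
    return is_array($result) ? $result : [];
}

$xml = simplexml_load_string('<sample><element/></sample>');

$elements = xpathAll($xml, '//element');
$none     = xpathAll($xml, '//noelement/@attr');

echo count($elements), "\n";
echo count($none), "\n";
```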

Page 9: Lumberjack XPath 101

XSL

● Libxslt
  ● based on Libxml2
● ext/xsl
● ext/xslcache

Page 10: Lumberjack XPath 101

Syntax

/element/child[@attr]

● Absolute Path: the leading /
● Step 1: element
● Separator: the / between the steps
● Step 2: child
● Predicate: [@attr]
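
The expression from the slide can be tried against a small document. This is a sketch; the element content ("first", "second", "third") and the `<other>` element are made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<element>'
    . '<child attr="yes">first</child>'
    . '<child>second</child>'
    . '<other attr="yes">third</other>'
    . '</element>'
);
$xpath = new DOMXPath($dom);

// Step 1 selects the root <element>, step 2 its <child> elements,
// and the predicate keeps only those that carry an attr attribute.
$matches = [];
foreach ($xpath->evaluate('/element/child[@attr]') as $node) {
    $matches[] = $node->textContent;
}
print_r($matches);
```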

Page 11: Lumberjack XPath 101

Nodes

● node()
● * or qualified-name
● text()
● comment()
● processing-instruction()
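
A short sketch of what each node test selects, counting the children of a made-up `<root>` element that holds one comment, one processing instruction, one element and one text node.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<root><!-- note --><?target data?><item>text</item>tail</root>'
);
$xpath = new DOMXPath($dom);

// Count the children of <root> that pass each node test.
$counts = [];
$tests = ['node()', '*', 'text()', 'comment()', 'processing-instruction()'];
foreach ($tests as $test) {
    $counts[$test] = (int) $xpath->evaluate("count(/root/$test)");
}
print_r($counts);
```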

Page 12: Lumberjack XPath 101

Axes

● axis::...
● Full syntax
● Short Syntax
● Default Axis

Page 13: Lumberjack XPath 101

child

<barcamps>
  <barcamp title="PHP Unconference Hamburg" id="phpuchh">
    <link href="http://www.php-unconference.de/" />
  </barcamp>
  <barcamp title="PHP Barcamp Salzburg" id="phpbcat">
    <link href="http://www.phpbarcamp.at/cms/" />
    <speakers-featured>
      <speaker>Bastian Feder</speaker>
    </speakers-featured>
    <speakers>
      <speaker>Thomas Weinert</speaker>
    </speakers>
  </barcamp>
  <barcamp title="PHP Unconference Europe" id="phpuceu">
    <link href="http://www.phpuceu.org/" />
  </barcamp>
</barcamps>

Page 14: Lumberjack XPath 101

descendant

● descendant:: selects the children of the context node, their children, and so on, in the sample document from the child slide

Page 15: Lumberjack XPath 101

parent

● parent:: selects the one node that directly contains the context node, in the sample document from the child slide

Page 16: Lumberjack XPath 101

following-sibling

● following-sibling:: selects the nodes that share the context node's parent and come after it in document order, in the sample document from the child slide
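
The four axes from these slides can be compared side by side on the sample document. A sketch, assuming the barcamp with id "phpbcat" as the context node and the last `<link>` written as an empty element so the document parses; namesOn() is a hypothetical helper, not part of ext/dom.

```php
<?php
$xml = <<<'XML'
<barcamps>
  <barcamp title="PHP Unconference Hamburg" id="phpuchh">
    <link href="http://www.php-unconference.de/"/>
  </barcamp>
  <barcamp title="PHP Barcamp Salzburg" id="phpbcat">
    <link href="http://www.phpbarcamp.at/cms/"/>
    <speakers-featured>
      <speaker>Bastian Feder</speaker>
    </speakers-featured>
    <speakers>
      <speaker>Thomas Weinert</speaker>
    </speakers>
  </barcamp>
  <barcamp title="PHP Unconference Europe" id="phpuceu">
    <link href="http://www.phpuceu.org/"/>
  </barcamp>
</barcamps>
XML;

// Hypothetical helper: node names selected by an expression.
function namesOn(DOMXPath $xpath, string $expression, DOMNode $context): array
{
    $names = [];
    foreach ($xpath->evaluate($expression, $context) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Context node: the Salzburg barcamp.
$context = $xpath->evaluate('/barcamps/barcamp[@id = "phpbcat"]')->item(0);

$axes = [];
$expressions = ['child::*', 'descendant::*', 'parent::*', 'following-sibling::*'];
foreach ($expressions as $expression) {
    $axes[$expression] = namesOn($xpath, $expression, $context);
}
print_r($axes);
```

Using `*` as the node test keeps the whitespace text nodes between the elements out of the result.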

Page 17: Lumberjack XPath 101

More Axes

● ancestor
● ancestor-or-self
● descendant-or-self
● following
● preceding
● preceding-sibling
● self
● attribute
● namespace

Page 18: Lumberjack XPath 101

Short Syntax

Axis                  Short
child                 (default axis, no prefix)
self                  .
parent                ..
attribute             @
descendant-or-self    //

● self::node()/descendant-or-self::node()/child::para

● .//para
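
A sketch that checks the two spellings above select the same nodes. The `<doc>` sample with its two `<para>` elements is made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<doc><para>one</para><section><para>two</para></section></doc>'
);
$xpath = new DOMXPath($dom);
$context = $dom->documentElement;

// Full syntax and short syntax, both relative to <doc>.
$full  = $xpath->evaluate(
    'self::node()/descendant-or-self::node()/child::para',
    $context
);
$short = $xpath->evaluate('.//para', $context);

// Compare the two lists node by node.
$same = $full->length === $short->length;
foreach ($full as $index => $node) {
    $same = $same && $node->isSameNode($short->item($index));
}
var_dump($full->length, $same);
```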

Page 19: Lumberjack XPath 101

Cast Functions

● string()
● number()
● boolean()

echo $xpath->evaluate('string(/html/head/title)');
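
A sketch of the three cast functions with evaluate(), which returns the matching PHP scalar type. The small HTML-like document is made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<html><head><title>XPath 101</title></head>'
    . '<body><p>42</p></body></html>'
);
$xpath = new DOMXPath($dom);

// string() takes the text content of the first matching node.
$title = $xpath->evaluate('string(/html/head/title)');
// number() converts that text to a float.
$answer = $xpath->evaluate('number(/html/body/p)');
// boolean() is true when the node set is not empty.
$hasDiv  = $xpath->evaluate('boolean(/html/body/div)');
$hasPara = $xpath->evaluate('boolean(/html/body/p)');

var_dump($title, $answer, $hasDiv, $hasPara);
```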

Page 20: Lumberjack XPath 101

Node Functions

● count()
● last()
● position()
● name()
● local-name()
● namespace-uri()

$list = $xpath->evaluate(
  '//*[local-name() = "li" and position() = last()]'
);
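
A sketch of the node functions on a made-up three-item list. Note that position() and last() count among the element children of each parent, so the expression picks the last `<li>` of the list.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<ul><li>one</li><li>two</li><li>three</li></ul>');
$xpath = new DOMXPath($dom);

// count() returns a float, cast here for readability.
$total = (int) $xpath->evaluate('count(//li)');

// The last element child of its parent that is named "li".
$lastItems = $xpath->evaluate(
    '//*[local-name() = "li" and position() = last()]'
);

// name() of the document element.
$rootName = $xpath->evaluate('name(/*)');

var_dump($total, $lastItems->item(0)->textContent, $rootName);
```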

Page 21: Lumberjack XPath 101

String Functions

● concat()
● starts-with()
● contains()
● substring-before()
● substring-after()
● substring()
● string-length()
● normalize-space()
● translate()
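
A sketch that runs several of these functions against one made-up `<talk>` element with untidy whitespace. XPath 1 counts string positions from 1, so substring(..., 5, 10) starts at the fifth character.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<talk>  The Lumberjack -   XPath 101 </talk>');
$xpath = new DOMXPath($dom);

$expressions = [
    'normalize-space'  => 'normalize-space(/talk)',
    'substring-before' => 'substring-before(normalize-space(/talk), " - ")',
    'substring-after'  => 'substring-after(normalize-space(/talk), " - ")',
    'substring'        => 'substring(normalize-space(/talk), 5, 10)',
    'string-length'    => 'string-length(normalize-space(/talk))',
    'starts-with'      => 'starts-with(normalize-space(/talk), "The")',
    'contains'         => 'contains(/talk, "XPath")',
    'concat'           => 'concat("[", normalize-space(/talk), "]")',
    'translate'        => 'translate(normalize-space(/talk), "aeiou", "AEIOU")',
];

// Evaluate every expression and collect the scalar results.
$results = [];
foreach ($expressions as $label => $expression) {
    $results[$label] = $xpath->evaluate($expression);
}
var_dump($results);
```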

Page 22: Lumberjack XPath 101

Match A Class

● normalize-space()
● concat()
● contains()
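
The three functions are commonly combined as follows to match one class name inside a space-separated class attribute. A sketch; the list markup and the class name "active" are made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<ul>'
    . '<li class="item  active">a</li>'
    . '<li class="inactive">b</li>'
    . '<li class="active">c</li>'
    . '<li>d</li>'
    . '</ul>'
);
$xpath = new DOMXPath($dom);

// Collect the text of all matched nodes.
function textOf(DOMNodeList $nodes): array
{
    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = $node->textContent;
    }
    return $texts;
}

// Pad the normalised class list with spaces, then look for " active ".
$matched = textOf($xpath->evaluate(
    '//li[contains(concat(" ", normalize-space(@class), " "), " active ")]'
));

// A plain contains() also matches "inactive".
$naive = textOf($xpath->evaluate('//li[contains(@class, "active")]'));

print_r($matched);
print_r($naive);
```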

Page 23: Lumberjack XPath 101

Namespaces

● URN
● Prefix
● Default Namespace
● Own Prefixes
● Attributes
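
A sketch of the points above using registerNamespace(). The feed document, the "urn:example" namespace and the prefixes "atom" and "ex" are assumptions chosen for illustration; the prefixes in the expression do not have to match those in the document.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:x="urn:example">'
    . '<title x:lang="en">Lumberjack</title>'
    . '</feed>'
);
$xpath = new DOMXPath($dom);

// Own prefixes, bound to the namespaces used in the document.
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('ex', 'urn:example');

// XPath 1 has no default namespace: an unprefixed name only
// matches elements that are in no namespace at all.
$unprefixed = (int) $xpath->evaluate('count(//title)');

// With a prefix the element and its namespaced attribute are found.
$title = $xpath->evaluate('string(//atom:title)');
$lang  = $xpath->evaluate('string(//atom:title/@ex:lang)');

var_dump($unprefixed, $title, $lang);
```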

Page 24: Lumberjack XPath 101

Bug #49490

● Namespace prefix conflict

$dom = new DOMDocument();
$dom->loadXML(
  '<foobar><a:foo xmlns:a="urn:a">'.
  '<b:bar xmlns:b="urn:b"/></a:foo>'.
  '</foobar>'
);
$xpath = new DOMXPath($dom);
$context = $dom->documentElement->firstChild;
$xpath->registerNamespace('a', 'urn:b');
var_dump(
  $xpath->evaluate('descendant-or-self::a:*', $context)
    ->item(0)->tagName
);
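
One way to sidestep the conflict is the optional third argument of evaluate() and query(), registerNodeNS, which the PHP manual describes as disabling the automatic registration of the context node's namespaces. A sketch under that assumption, reusing the document from the slide:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<foobar><a:foo xmlns:a="urn:a">'
    . '<b:bar xmlns:b="urn:b"/></a:foo>'
    . '</foobar>'
);
$xpath = new DOMXPath($dom);
$context = $dom->documentElement->firstChild;
$xpath->registerNamespace('a', 'urn:b');

// Passing false as registerNodeNS keeps the prefixes declared on
// the context node out of the lookup, so "a" means urn:b only.
$nodes = $xpath->evaluate('descendant-or-self::a:*', $context, false);
$tagName = $nodes->item(0)->tagName;
echo $tagName, "\n";
```

With the flag set to false, only prefixes registered through registerNamespace() are available in the expression.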

Page 25: Lumberjack XPath 101

Tools

● Firebug
● Firefox AddOns

Page 26: Lumberjack XPath 101

CSS Selectors

● JavaScript libraries
● element nodes
  ● *
● no axes
  ● descendant-or-self::*
● can ignore namespaces
  ● descendant-or-self::*[local-name() = '...']
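
A sketch of the last pattern: matching elements by local name, the way a CSS selector such as `p` would, regardless of their namespace. The XHTML snippet is made up for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    . '<body><p>one</p><div><p>two</p></div></body>'
    . '</html>'
);
$xpath = new DOMXPath($dom);
$context = $dom->documentElement;

// An unprefixed name test finds nothing, because the
// elements are in the XHTML namespace.
$plain = $xpath->evaluate('descendant-or-self::p', $context);

// Comparing local-name() ignores the namespace.
$byLocalName = $xpath->evaluate(
    "descendant-or-self::*[local-name() = 'p']",
    $context
);

var_dump($plain->length, $byLocalName->length);
```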

Page 27: Lumberjack XPath 101

Thanks

● Web:
  ● http://www.papaya-cms.com/
  ● http://www.a-basketful-of-papayas.net/
● Twitter
  ● @ThomasWeinert
● Joind.in
  ● http://joind.in/1621