XML Schema & e-GIF

London, 26th October 2005

Agenda

• e-GIF Compliance
• GovTalk Schema

Compliance

• Apart from the eGIF specification and technical documents there seems to be no freely available published guidance on e-GIF compliance, or even public discussion of the same

from the e-GIF Forum …

Topic: e-GIF Compliance Guidelines (1 of 2), Subject: e-GIF, From: Ade, Date: 18 November 2004

Is anybody aware of the existence of freely available e-GIF compliance guidelines?

Topic: e-GIF Compliance Guidelines (2 of 2), Subject: e-GIF, From: Daniel, Date: 18 November 2004

I don't know of any free guidance other than the GovTalk materials.

from http://www.egifcompliance.org/ …

Q. Could you please advise me if there are any policy documents regarding the use of XML? I understand there are, and that the use of XML is mandated. I am only able to locate pointers to evolving XML schemas.

A. The e-Government Interoperability Framework is the policy that specifies the use of XML as the primary means for data integration. See Part 1 for the general policies to govern that approach and Part 2 for more technical detail including a table which specifies a list of XML specifications for business areas.

so, from the e-GIF …

What does complying with the e-GIF mean?

6.3 At the highest level, complying with the e-GIF means:
• providing a browser interface for access
• using XML as the primary means for data integration
• using Internet and World Wide Web standards
• using metadata for content management.

6.4 These four elements are fundamental, but equivalent standards and additional interfaces are permissible. For example, government users will predominantly use a browser interface but may also have the option of Notes client access to an .nsf facility for knowledge management and workflow internally within the Knowledge Network.

6.5 The ultimate test for interoperability is the coherent exchange of information and services between systems. If this is achieved, then the system can be regarded as truly interoperable. Furthermore, it must be possible to replace any component or product used within an interface with another of a similar specification while maintaining the functionality of the system. To be e-GIF compliant, a system should satisfy both these requirements.

Compliance mapping to e-GIF

Primarily the metadata we apply to our XML Schema as content management metadata

Plus the potential extension profile we define for data quality and consistency?

Primarily employed in the data dictionary for attribute values

Primarily applies at the implementation level, though has some high-level design implications also (described next in this presentation)

The current ‘problem area’ and focus of this discussion

Technical Standards Catalogue

• Simply a big list of technologies and standards to be employed by all ‘e-GIF compliant’ projects.
• Almost any modern development would obtain default compliance – pretty much the only real way not to comply would be to build a proprietary system from scratch. Or deploy a mainframe perhaps...
• The following slides contain the standards that would be of key applicability to our implementation…

Applicability: Metadata

• Content management metadata definition: XML Schema. Government XML metadata schema will be held at http://www.govtalk.gov.uk/schemasstandards/xmlschema.asp (status: A)
• Content management metadata elements and refinements: e-GMS, which incorporates Dublin Core http://www.govtalk.gov.uk/schemasstandards/metadata.asp (status: A)
• Subject element, category refinement:
– GCL (Government Category List) http://www.govtalk.gov.uk/schemasstandards/gcl.asp The GCL may be withdrawn when the next version of the e-GMS is issued; it is advised to use the IPSV. (status: U)
– IPSV (Integrated Public Sector Vocabulary) http://www.esd.org.uk/standards/ipsv/ (status: A)
• Data definition: Government Data Standards Catalogue http://www.govtalk.gov.uk/schemasstandards/eservices.asp (status: A)

Applicability: Data Integration

• Data integration metadata/meta language: XML (Extensible Markup Language) as defined by W3C http://www.w3.org/XML (status: A)
• Data integration metadata definition: XML schema as defined by W3C. The specifications can be found at XML Schema Part 1: Structures http://www.w3.org/TR/xmlschema-1/structures and XML Schema Part 2: Datatypes http://www.w3.org/TR/xmlschema-2/datatypes Government XML schemas: for the latest versions see the GovTalk site at http://www.govtalk.gov.uk/schemasstandards/schemalibrary.asp (status: A)
• Data transformation: XSL (Extensible Stylesheet Language) as defined by W3C http://www.w3.org/TR/xsl and XSL Transformation (XSLT) as defined by W3C http://www.w3.org/TR/xslt (status: A)
• Data description language: RDF (Resource Description Framework) as defined by W3C http://www.w3.org/TR/REC-rdf-syntax/ RDF can be used with OWL for adding semantics. (status: A)

Applicability: Web Services - General

• Web service request delivery: SOAP v1.2, as defined by the W3C http://www.w3.org/TR/soap12-part1/ http://www.w3.org/TR/soap12-part2/ Guidance on the use of SOAP can be found at http://www.w3.org/TR/soap12-part0/ and http://www.w3.org/TR/xmlp-scenarios/ See the W3C web site http://www.w3.org for the latest drafts of the SOAP specifications and transport bindings. Web services may use SOAP version 1.1 as an interim solution provided there is a migration strategy for conformance to SOAP version 1.2. (status: A)
• Web service request registry: UDDI v3.0 specification (Universal Description, Discovery and Integration) defined by OASIS http://www.uddi.org/specification.html Applicable for dynamic Web services requiring web service discovery using WSDL. (status: R)
• Web service description language: WSDL 1.1, Web Service Description Language as defined by the W3C. The specifications can be found at http://www.w3.org/TR/wsdl (status: A)

TSC Applicability: Web Services - Interoperability

• Web service basic interoperability profile:
– Basic Profile Version 1.0 (BdAD Final Material) as defined by the Web Services Interoperability Organisation (WS-I) http://www.ws-i.org/Profiles/BasicProfile-1.0-2004-04-16.html (status: R)
– Basic Profile 1.0 – Errata as defined by WS-I http://www.ws-i.org/Profiles/BasicProfile-1.0-errata-2004-03-17.html (status: U)
– Basic Profile Version 1.1 as defined by WS-I http://www.ws-i.org/Profiles/Basic/2003-12/BasicProfile-1.1.pdf (status: U)
– Simple SOAP Binding Profile 1.0 as defined by WS-I http://www.ws-i.org/Profiles/Basic/2003-08/SimpleSoapBindingProfile-1.0.pdf (status: U)

Applicability: Web Services - Security

• Web services security:
– Basic Security Profile Version 1.0 (WS-I Security) as defined by WS-I http://www.ws-i.org/Profiles/BasicSecurityProfile-1.0-2005-01-20.html (status: A)
– RFC 2818: HTTP over TLS as defined by IETF http://www.ietf.org/rfc/rfc2818.txt (status: A)
– Web Services Security: SOAP Message Security 1.0 (WS-Security 2004) as defined by OASIS http://docs.oasis-open.org/wss/2004/01/oasis-200401- (status: R)

What this means…

• The interface specification will be described in WSDL 1.1 for transport over SOAP 1.2
• The types of objects employed by the interface specification will be described in XML Schema
• The values of those object types will employ GDSC lists where applicable
• WSDL and Schema will contain e-GMS metadata for schema management purposes
• WSDL and Schema must be shown to comply with WS-Interoperability standards
• In addition to physical security, web service messages will be secured using WS-Security
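As a rough illustration of the SOAP 1.2 requirement (with SOAP 1.1 tolerated only as an interim step), the Python sketch below tells the two versions apart by their envelope namespace. The namespace URIs are the ones published for each SOAP version; the function name and the sample message are hypothetical and not part of any e-GIF material.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Envelope namespaces published for each SOAP version.
SOAP_NAMESPACES = {
    "http://www.w3.org/2003/05/soap-envelope": "1.2",
    "http://schemas.xmlsoap.org/soap/envelope/": "1.1",
}


def soap_version(message: str) -> Optional[str]:
    """Return "1.1" or "1.2" for a SOAP envelope, or None otherwise."""
    root = ET.fromstring(message)
    # ElementTree renders qualified names as "{namespace}localname".
    if not root.tag.startswith("{"):
        return None
    namespace, _, local_name = root.tag[1:].partition("}")
    if local_name != "Envelope":
        return None
    return SOAP_NAMESPACES.get(namespace)


# Hypothetical message, for illustration only.
sample = (
    '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    "<env:Body/></env:Envelope>"
)
print(soap_version(sample))  # prints 1.2
```

A check of this kind could sit at a service boundary to report which callers still depend on the interim SOAP 1.1 binding.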

But the e-GIF also says…

Use of XML schemas and data standards

6.9 Systems are expected to use agreed XML schemas and agreed data standards listed in the GDSC, both of which are available on GovTalk. Should suitable schemas or data standards not be available, or if those available are deemed inadequate in some way, the system purchaser/sponsor should invoke the RFP/RFC processes immediately.

So …

• Bundling the use of the GovTalk schemas with the GDSC is unhelpful, but the wording could be interpreted as meaning that use of the GovTalk schemas is not necessarily mandatory

• A more detailed discussion as to why we might not necessarily want to employ these schema follows…

Agenda

• e-GIF Compliance
• GovTalk Schema

Applicable Schema

• Address and Personal Details 2.0 21/03/2005
– PersonDescriptiveTypes-v1-1.xsd
– CitizenIdentificationTypes-v1-4.xsd
– AddressTypes-v2-0.xsd
– bs7666-v2-0.xsd
– PersonalDetailsTypes-v1-3.xsd [Deprecated]
– CommonSimpleTypes-v1-3.xsd
– ContactTypes-v1-3.xsd
(Note that apart from the schema themselves, no other documentation or examples are included)
• eGMS 3.0 Application Profile and XML Schema Binding
– for content management metadata
– includes schema, an example and documentation

Schema Issues

“The first versions were published in September 2000 with ongoing regular and managed revision. It is not clear how successful it has been in 4 years. Measures of usage (or success?) do not appear to be published while other signs like continuing to stress “e-GIF is relatively new” and the low volume of contributors to the eGovTalk discussion board suggest it has yet to gain significant momentum.”

“White Paper: Applying the UK eGovernment Interoperability Framework (eGIF) to Geographic Information (GI) Services” - Geowise 31/01/2005

Schema Issues

• Indeed, there appears to be a general lack of published independent analysis and review, either of the approach employed by e-GIF in mandating and developing such schema or, technically, of the schema themselves
• Why does the UK public sector need to have its own XML schema? Surely international standards and conventions are more appropriate...
• The lack of public forum activity raises questions regarding both the uptake and practical supportability of the schema
• Even after 4+ years of development the schema appear prone to large, non-trivial and non-backwards compatible revisions…

[Figure: PersonDetailsTypes 1.3 [Deprecated] compared with PersonDescriptiveTypes 1.1 [New]. Annotations: breaking namespace change; breaking naming changes, but no underlying data type changes; three structures deleted.]

• In data terms, previous implementations can re-map one-to-one to the new structure. But that’s hardly the point:
– The previous example breaks previous implementations – i.e. implementations employing the previous version will not be able to exchange data with newer implementations, at least not without re-work to such implementations and their dependencies
– Little, however, appears to be gained from this apart from naming changes
• The new structures display all the same attributes of brittleness that make them susceptible to such changes.
• An alternative approach might be…

This structure represents a higher level of abstraction, which is more reusable, and this one is also based on the BS standard

A person can have many well encapsulated names

A person’s name as a whole can change over time (this though could be applied to the NameElement itself)

A person’s name can be composed of many different elements in a specified order. Allowable values – and what those elements are called – are abstracted from the definition and placed in a separate, extensible CV. Changes therefore do not break previous implementations

As the comment says, this simply mirrors the previous interface but breaks it due to the brittle specification
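One way to picture the more abstract structure described in these annotations is the following Python sketch. It is not taken from the presentation's schema: the class names, field names and the contents of NAME_ELEMENT_CV are hypothetical, chosen only to show ordered name elements whose types come from a separate, extensible controlled vocabulary (CV).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Hypothetical controlled vocabulary of element types. Extending this set
# does not alter the structure of the classes below.
NAME_ELEMENT_CV = {"Title", "GivenName", "FamilyName", "Suffix"}


@dataclass
class NameElement:
    element_type: str  # a code drawn from NAME_ELEMENT_CV
    value: str         # the text of this part of the name
    position: int      # order of the element within the full name


@dataclass
class Name:
    elements: List[NameElement] = field(default_factory=list)
    valid_from: Optional[date] = None  # lets a whole name change over time
    valid_to: Optional[date] = None

    def full_text(self) -> str:
        ordered = sorted(self.elements, key=lambda e: e.position)
        return " ".join(e.value for e in ordered)


@dataclass
class Person:
    names: List[Name] = field(default_factory=list)  # many names per person


def unknown_element_types(name: Name) -> List[str]:
    """Return element types that are not in the CV (a run-time check)."""
    return [e.element_type for e in name.elements
            if e.element_type not in NAME_ELEMENT_CV]


current = Name(elements=[
    NameElement("FamilyName", "Smith", 2),
    NameElement("GivenName", "Jo", 1),
])
person = Person(names=[current])
print(person.names[0].full_text())  # prints Jo Smith
```

Under these assumptions, introducing a new kind of name element means adding a code to the vocabulary, while the classes (and any interface derived from them) stay the same.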

Trade-offs: Specific

• person:NameType is more complex, harder to understand and will therefore be more difficult to implement

• person:NameType is functionally richer allowing histories of names to be captured. Such functionality could be added to a PersonNameStructure compliant schema however

• person:NameType is specified at a higher level of abstraction making it more reusable and resilient to change.

• person:NameType is based on the BS naming standard. Future versions of PersonNameStructure may change to support this.

• PersonNameStructure provides a more precise specification – a PersonGivenName, for example, is defined as being of RestrictedStringType 1-35 characters in length. person:Element, being more abstract, needs to hold all NameElement data and is therefore of type String restricted to 70 characters. This is a trade-off in itself…

Trade-offs: Specification precision

The more highly specified the business process and use-case for an interface or message contract, the more precisely specified the interface can be. The advantage of high precision in an interface specification is that it prevents the exchange of messages that do not comply with the interface. In the previous example a message should not be sent if the PersonGivenName is less than 1 character or greater than 35 characters in length. In addition, only certain characters are allowed, which excludes characters that are common in specialist systems and European languages (e.g. Günter is not valid according to the specification). If such a message is sent, the receiver should reject it without further processing.

The person:NameType specification cannot place such up-front constraints on an element because, at its higher level of abstraction, it handles many different types of element – a person’s GivenName being just one. This is not to say that rules defining such constraints cannot be specified – they can – but they will be enforced through pre- and post-processing conditions and responsibilities, and not as part of the interface specification itself.

If the use-cases and business processes are not well defined at the time of specification, or if these are themselves liable to change, more abstract interfaces that are less sensitive to change should be preferred.
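The two styles of constraint can be contrasted in a short Python sketch. The 1-35 character rule and the 70-character limit are the figures quoted earlier for PersonGivenName and the generic name element; the function names and the rule table are illustrative assumptions about how such rules might be enforced in pre-processing rather than in the interface itself.

```python
from typing import Callable, Dict


# Precise contract: the rule is fixed in the interface for one element.
def valid_given_name(value: str) -> bool:
    return 1 <= len(value) <= 35


# Abstract contract: the interface only bounds the generic element (70
# characters); per-type rules live in a table applied in pre-processing.
PRE_PROCESSING_RULES: Dict[str, Callable[[str], bool]] = {
    "GivenName": lambda v: 1 <= len(v) <= 35,  # hypothetical rule table
}


def valid_name_element(element_type: str, value: str) -> bool:
    if len(value) > 70:
        return False  # the only constraint the abstract interface imposes
    rule = PRE_PROCESSING_RULES.get(element_type)
    return rule(value) if rule else True
```

In this sketch, tightening or adding a rule for one element type changes only the table, whereas changing the limit in valid_given_name corresponds to a change in the published interface.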

More examples

Small reusable structures instead of multiple similar ones

Requires no interface changes to add new national contact types; this mechanism also allows agencies to add their own specific types

More examples

In addition, the complexity of identifying reference numbers is hidden – neither the sender nor the receiver needs to know whether a PassportNumber, for example, is in the old or new format prior to processing the message.

Note, this structure has been removed from the latest schema version, which now includes additional reference numbers

More examples

Using pattern restrictions to specify only what characters are allowed is useful for specific purposes:

<xsd:simpleType name="RestrictedStringType">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="[A-Za-z0-9\s~!&quot;@#$%&amp;'\(\)\*\+,\-\./:;&lt;=&gt;\?\[\\\]_\{\}\^&#xa3;&#x20ac;]*"/>
  </xsd:restriction>
</xsd:simpleType>

However, this is used extensively throughout the e-GIF schema as the base type for most strings. So

<PersonFamilyName>Günter</PersonFamilyName>

may not be valid, and the usefulness of the schema for data exchange between specialist systems is limited.
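The effect of the pattern can be checked with a few lines of Python. This is an approximate rendering only: the XML entities are replaced by their literal characters, and Python's `\s` (even with `re.ASCII`) covers slightly more whitespace than the XSD `\s`. The function name is hypothetical.

```python
import re

# Python rendering of the RestrictedStringType pattern. XSD patterns are
# implicitly anchored, so fullmatch() is used rather than search().
RESTRICTED_STRING = re.compile(
    r"""[A-Za-z0-9\s~!"@#$%&'\(\)\*\+,\-\./:;<=>\?\[\\\]_\{\}\^£€]*""",
    re.ASCII,
)


def is_restricted_string(value: str) -> bool:
    return RESTRICTED_STRING.fullmatch(value) is not None


for name in ("Gunter", "O'Neill-Smith", "Günter"):
    print(name, is_restricted_string(name))
# Gunter True
# O'Neill-Smith True
# Günter False
```

Any letter outside unaccented A-Z and a-z is rejected, which is why the umlaut alone makes the name fail.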

More examples

Enumerations of coded values such as

<xsd:simpleType name="GenderCurrentType">
  <xsd:restriction base="xsd:byte">
    <xsd:pattern value="0"/>
    <xsd:pattern value="1"/>
    <xsd:pattern value="2"/>
    <xsd:pattern value="9"/>
  </xsd:restriction>
</xsd:simpleType>

are more useful, and resilient to change, than simple lists of strings:

<xsd:simpleType name="MaritalStatusType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="s"/>
    <xsd:enumeration value="m"/>
    <xsd:enumeration value="d"/>
    <xsd:enumeration value="w"/>
    <xsd:enumeration value="n"/>
    <xsd:enumeration value="p"/>
  </xsd:restriction>
</xsd:simpleType>

To be fair, a number of types have changed in the current version to better support this approach. That said, a still more abstract and change-resistant approach is possible, where values are validated at run-time, e.g.

<xsd:simpleType name="CVValueCode">
  <xsd:restriction base="xsd:string">
    <xsd:maxLength value="100"/>
  </xsd:restriction>
</xsd:simpleType>
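A minimal sketch of such run-time validation, in Python: the schema only guarantees a string of at most 100 characters, and the allowed codes are looked up when the message is processed. The names CONTROLLED_VOCABULARIES and validate_cv_value, and the extra code "c", are hypothetical; the initial marital status codes are those from the enumeration above.

```python
from typing import Dict, Set

MAX_CV_VALUE_LENGTH = 100  # mirrors xsd:maxLength on CVValueCode

# Hypothetical vocabulary store; in practice it might be loaded from shared
# reference data so that new codes need no schema release.
CONTROLLED_VOCABULARIES: Dict[str, Set[str]] = {
    "MaritalStatus": {"s", "m", "d", "w", "n", "p"},
}


def validate_cv_value(vocabulary: str, code: str) -> bool:
    """Check a code against its vocabulary at run time."""
    if len(code) > MAX_CV_VALUE_LENGTH:
        return False  # would already fail schema validation
    return code in CONTROLLED_VOCABULARIES.get(vocabulary, set())


print(validate_cv_value("MaritalStatus", "m"))  # prints True
# A new (hypothetical) code is added without changing the interface.
CONTROLLED_VOCABULARIES["MaritalStatus"].add("c")
print(validate_cv_value("MaritalStatus", "c"))  # prints True
```

The trade-off is the one discussed under specification precision: an invalid code is no longer caught by schema validation and has to be rejected by the receiving application instead.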

Changes between versions

• Change control is managed, but there appears to be a new version on average every year.
• Some changes are fault fixes, as to be expected in any interface. Long release cycles could make implementations unviable, however, and many faults corrected between the previous and current versions have resulted in breaking changes between versions.
• Some changes are nomenclative (CitizenNameType to PersonNameType) – no change to the underlying data type, but breaking previous implementations nonetheless
• Some changes are structural (CitizenDetailsStructure and CitizenRegistration have been removed). These changes could be interpreted as stepping back from prior attempts to provide overly ambitious structures. If this move to provide more architectural, reusable schema fragments is intentional then it is to be welcomed.
• Overall, it would appear, the schemas remain specific (unlike their usage) and therefore susceptible to change.
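Because a purely nomenclative change leaves the data untouched, the re-work on the consuming side amounts to a rename, as the Python sketch below suggests. The element names in RENAMES and the sample document are hypothetical, loosely modelled on the Citizen-to-Person renaming; namespaces are omitted for brevity, although the real change also altered the namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical old-to-new element names; not taken from the schemas.
RENAMES = {
    "CitizenName": "PersonName",
    "CitizenGivenName": "PersonGivenName",
    "CitizenFamilyName": "PersonFamilyName",
}


def remap(document: str) -> str:
    """Rename elements one-to-one, leaving text and structure unchanged."""
    root = ET.fromstring(document)
    for element in root.iter():
        element.tag = RENAMES.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


old = (
    "<CitizenName><CitizenGivenName>Jo</CitizenGivenName>"
    "<CitizenFamilyName>Smith</CitizenFamilyName></CitizenName>"
)
print(remap(old))
```

That the mapping is this mechanical is the point made above: every existing implementation still has to be re-worked and re-tested, for no gain in the data exchanged.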

Reminder

“The ultimate test for interoperability is the coherent exchange of information and services between systems. If this is achieved, then the system can be regarded as truly interoperable. Furthermore, it must be possible to replace any component or product used within an interface with another of a similar specification while maintaining the functionality of the system. To be e-GIF compliant, a system should satisfy both these requirements”

• Both are general design goals of modern interconnected systems readily and demonstrably satisfied using good interface design implemented using the W3C Web Services set of standards as per the Technical Standards Catalogue

“Systems are expected to use agreed XML schemas and agreed data standards listed in the GDSC, both of which are available on GovTalk. Should suitable schemas or data standards not be available, or if those available are deemed inadequate in some way, the system purchaser/sponsor should invoke the RFP/RFC processes immediately.”

• Use of the GDSC, and international standards catalogues where appropriate, is not a technical issue per se and can be complied with

• Use of the e-GIF/GovTalk XML Schemas has issues as outlined in this presentation. But they do not seem to be mandatory…

from the e-GIF Forum …

Topic: Benefits of e-GIF (1 of 2), Read 229 times, Subject: e-GIF, From: Barry, Date: 02 February 2005

I am preparing an e-GIF Strategy for English Partnerships this year. To engage the interest of our users, I need to provide them with some tangible benefits of e-GIF. Any contributions from organisations that have implemented e-GIF would be gratefully received.

Topic: Benefits of e-GIF (2 of 2), Subject: e-GIF, From: Jim, Date: 25 April 2005

I find it interesting that nobody has replied positively to this thread. Surely someone must think there is a benefit to using e-gif. It would be even better if a "real life" user could respond rather than someone quoting the manual!!