DISA Technical Note, 2002-01
1.1 Document status
2.1 Business background
2.2 Example, OTA_VehAvailRateRQ.XSD
4.1 Early applications
4.2 Later applications
Exhibit 1. OTA_VehAvailRateRQ.xsd, XML Schema syntax
Exhibit 2. OTA_VehAvailRateRQ.xsd, Componetizer HTML table output
This document is a technical paper for discussion in the general e-business community. Its distribution is unlimited. Style and formatting follow the Data Interchange Standards Association (DISA) publication guidelines.
Current version: Componetizer: a
tool for extracting and documenting XML Schema components, DISA Technical
Note 2002-1, 13 September 2002
This paper describes developments in standards publishing technology at DISA. While it documents the work of DISA’s technical operations director, Marcel Jemio, it also reflects ideas and comments provided by DISA’s president, Jerry Connors, and vice presidents Tim Cochran and Julia O’Brien.
The views expressed in this document are those of the authors and are not necessarily those of DISA. The authors and DISA specifically disclaim responsibility for any problems arising from correct or incorrect implementation or use of this information.
This document and the information contained
herein is provided on an "AS IS" basis. DISA disclaims all warranties,
express or implied, including but not limited to any warranty that the
use of the information herein will not infringe any rights or any implied
warranties of merchantability or fitness for a particular purpose.
The entire contents of the document are Copyright © 2002, Data Interchange Standards Association, all rights reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to Data Interchange Standards Association, except as required to translate it into languages other than English.
2.1 Business background
The publication by the World Wide Web Consortium (W3C) of the XML Schema 1.0 standard in May 2001 marked an important milestone in the development of XML as a business resource. XML Schema adds a number of key features to the basic Extensible Markup Language (XML) 1.0 standard that give XML more power and flexibility:
Structure – XML Schema defines and catalogues XML vocabularies, describing the meaning, usage, and relationship of the constituent parts of those vocabularies
Datatypes – XML Schema provides more choices for describing the data contained in XML documents, including basic built-in (or primitive) datatypes and the ability for users to define their own datatypes
The basic XML 1.0 standard allows for hierarchical documents, using sets of electronic rules called document type definitions (DTDs) to define and validate their structure. DTDs offer a way to define the structural rules for XML documents, but they use a syntax different from XML itself, derived from the Standard Generalized Markup Language (SGML) that preceded XML. XML 1.0 also offers few datatypes beyond simple strings, enumerated lists, and boolean operators. XML Schemas, on the other hand, are written in XML itself, relieving users of the need to master a separate syntax for document definition.
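Because a schema is itself an XML document, any general-purpose XML parser can read it. The following minimal sketch illustrates that point with Python's standard library. The schema fragment, including the `PassengerCount` type and the `Passengers` element, is hypothetical and is not drawn from any OTA specification.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: a user-defined datatype built on a
# built-in XML Schema datatype. Not taken from any OTA specification.
SCHEMA_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="PassengerCount">
    <xs:restriction base="xs:positiveInteger">
      <xs:maxInclusive value="9"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="Passengers" type="PassengerCount"/>
</xs:schema>
"""


def list_simple_types(schema_text):
    """Return (name, base) pairs for each top-level named simpleType."""
    root = ET.fromstring(schema_text)
    pairs = []
    for simple_type in root.findall(f"{{{XS}}}simpleType"):
        restriction = simple_type.find(f"{{{XS}}}restriction")
        base = restriction.get("base") if restriction is not None else None
        pairs.append((simple_type.get("name"), base))
    return pairs


SIMPLE_TYPES = list_simple_types(SCHEMA_TEXT)
print(SIMPLE_TYPES)
```

A DTD could not be read this way, since its declarations are not XML elements.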
For business applications, XML Schema opens up a wider range of options. It makes XML more adaptable to business databases that use relational or object-oriented structures, as well as offering the ability to support more types of data, including those defined by the users themselves. But all of these features come with a cost, namely more complexity. And while XML Schema offers a more powerful set of tools for business, using those tools requires more intensive hands-on management by implementers and developers.
One of the more demanding jobs in writing XML schemas is preparing the associated documentation that implementers and developers need to understand their structure and contents. While tools exist to help develop schemas, few if any tools are available to extract the individual data items from schemas and present them in an easy-to-read and understandable form.
XML editors have features that generate documentation directly from the schema files. While the documentation produced by commercial editors is often comprehensive, it can also be voluminous and contain much more than a simple table of components. The development of XML vocabularies, normally a collaborative process, requires a way of capturing and documenting components from multiple schemas directly and simply. That is the function provided by the Componetizer.
2.2 Example, OTA_VehAvailRateRQ.XSD
Exhibit 1 and Exhibit 2, appended to this document, illustrate the need for a tool like the Componetizer, using one of the schemas from the OpenTravel Alliance (OTA) 2002A specifications. This schema, OTA_VehAvailRateRQ.XSD, defines a request message for car rental availability and rates.
This schema, part of a larger collection of schemas in the OTA specifications, follows two general schema formats, or complex types: VehicleAvailRQCoreType and VehicleAvailRQAdditionalInfoType. These complex types provide common components that other OTA car-rental schemas can reuse, with obvious efficiencies for schema and message designers.
Someone familiar with the syntax could probably read and understand the contents of the schema, as listed in Exhibit 1, and identify those components and their properties. But the components of the schema and their properties are seen and understood much more clearly in Exhibit 2.
The tables prepared by the Componetizer provide seven characteristics for each component:
Container. The name of the component.
Asset. The general type of schema used in the overall schema. OTA defines, for example, OTA_CommonTypes.xsd and OTA_SimpleTypes.xsd schemas as its major categories.
Tag. The OTA naming conventions applied to the component’s XML tags.
Type. The basic data type, known as a primitive datatype in XML Schema.
Restriction. Values allowed to represent the component.
Extension. Additional restrictions or allowances for component values.
Definition. A brief description of the component.
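The seven characteristics listed above can be pictured as one record per component. The sketch below is an illustration only and does not reproduce the Componetizer's actual logic. The `ComponentRow` class, the `extract_rows` function, the sample `PickUpLocation` element, and the way each schema construct is mapped to a column are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

XS = "{http://www.w3.org/2001/XMLSchema}"


@dataclass
class ComponentRow:
    """One table row; field names mirror the seven characteristics."""
    container: str
    asset: str
    tag: str
    data_type: str
    restriction: str
    extension: str
    definition: str


# Hypothetical schema fragment, used only to exercise the sketch.
SCHEMA_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PickUpLocation">
    <xs:annotation>
      <xs:documentation>Code for the pick-up location.</xs:documentation>
    </xs:annotation>
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:maxLength value="8"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""


def extract_rows(schema_text, asset):
    """Collect one ComponentRow per xs:element (assumed mapping)."""
    root = ET.fromstring(schema_text)
    rows = []
    for element in root.iter(XS + "element"):
        name = element.get("name", "")
        doc = element.find(XS + "annotation/" + XS + "documentation")
        restriction = element.find(".//" + XS + "restriction")
        extension = element.find(".//" + XS + "extension")
        base = element.get("type", "")
        facets = []
        if restriction is not None:
            base = restriction.get("base", base)
            # Each child of xs:restriction is a facet such as maxLength.
            facets = [f"{f.tag.replace(XS, '')}={f.get('value')}"
                      for f in restriction]
        rows.append(ComponentRow(
            container=name,
            asset=asset,
            tag=name,
            data_type=base,
            restriction="; ".join(facets),
            extension=extension.get("base", "") if extension is not None else "",
            definition=(doc.text or "").strip() if doc is not None else "",
        ))
    return rows


ROWS = extract_rows(SCHEMA_TEXT, asset="OTA_CommonTypes.xsd")
print(ROWS[0])
```

This sketch handles only a flat schema. Nested elements, named type references, and imported schemas would need additional handling.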
Currently, the database schema is a series of tables mapped by identifiers to simulate a hierarchical relationship. XML Schema documents are hierarchical and capturing this content into a relational database is complex. Subsequent releases of the Componetizer will store content in a native XML database so as to take advantage of the inherent hierarchical relationships.
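One common way to simulate a hierarchy in relational tables is to give each row the identifier of its parent. The sketch below shows that general technique with an in-memory SQLite database. The `component` table, its columns, and the nesting of the sample rows (including the `PickUpLocation` entry) are hypothetical and do not describe the Componetizer's actual database schema.

```python
import sqlite3


def build_store():
    """Create an in-memory table where each row points at its parent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE component (
               id INTEGER PRIMARY KEY,
               parent_id INTEGER REFERENCES component(id),
               name TEXT NOT NULL
           )"""
    )
    # Illustrative rows only; the nesting shown here is an assumption.
    conn.executemany(
        "INSERT INTO component (id, parent_id, name) VALUES (?, ?, ?)",
        [
            (1, None, "OTA_VehAvailRateRQ"),
            (2, 1, "VehicleAvailRQCoreType"),
            (3, 1, "VehicleAvailRQAdditionalInfoType"),
            (4, 2, "PickUpLocation"),
        ],
    )
    return conn


def path_to(conn, component_id):
    """Follow parent identifiers upward to rebuild the hierarchical path."""
    names = []
    current = component_id
    while current is not None:
        row = conn.execute(
            "SELECT parent_id, name FROM component WHERE id = ?", (current,)
        ).fetchone()
        if row is None:
            break
        current, name = row
        names.append(name)
    return "/".join(reversed(names))


STORE = build_store()
print(path_to(STORE, 4))
```

Every step up the tree costs a separate lookup, which hints at why capturing hierarchical schema content in relational tables is complex.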
All programming logic follows an object-oriented (OO) paradigm, and the code is written to optimize processing speed.
4.1 Early applications
The Componetizer provides immediate payoffs in writing the documentation for schemas, including document type definitions or DTDs, developed by DISA’s affiliates. The most time-consuming part of the documentation is recording the details of the schema contents, which often involves tables. In the past, the specifications editors (industry volunteers or DISA staff) would manually capture these details in Word or Excel files. Any changes in the schemas would also mean adjusting or rewriting the tables, which for schemas of any complexity could take weeks.
With the Componetizer, however, DISA
can generate the tables automatically. This tool enables industry groups
to consider more comprehensive and complex schemas, as their business processes
demand. It also enables DISA to publish the documentation for the proposed
specifications more quickly.
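Generating a table from extracted component records is mechanically simple, which is why automation pays off when schemas change. The following minimal sketch renders records as an HTML table. The `render_table` function and the dictionary-based row format are assumptions for this example and do not reflect the Componetizer's actual output code.

```python
from html import escape

# Column headings follow the seven characteristics named in section 2.2.
COLUMNS = ["Container", "Asset", "Tag", "Type",
           "Restriction", "Extension", "Definition"]


def render_table(rows):
    """Render a list of dicts keyed by column name as an HTML table."""
    header = "".join(f"<th>{escape(column)}</th>" for column in COLUMNS)
    lines = ["<table>", f"  <tr>{header}</tr>"]
    for row in rows:
        # Missing keys become empty cells; values are HTML-escaped.
        cells = "".join(
            f"<td>{escape(str(row.get(column, '')))}</td>" for column in COLUMNS
        )
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


SAMPLE_HTML = render_table([
    {"Container": "PickUpLocation", "Type": "xs:string",
     "Restriction": "maxLength=8", "Definition": "Pick-up code, length < 9."},
])
print(SAMPLE_HTML)
```

Rerunning such a function after each schema revision replaces the manual table edits described above.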
4.2 Later applications
DISA uses the Componetizer now for documentation, but HTML tables are just one product from this tool. The Componetizer can be enhanced to provide graphical output (e.g., scalable vector graphics), spreadsheet formats, or word processing formats, as well as XML Topic Maps. With small adjustments, the Componetizer can also provide output in various database formats, which would enable DISA to establish a component store or warehouse. With the development of ebXML core components, the Componetizer could also assign core components as part of the database. With this step, DISA can also indicate where DISA affiliate components are semantically equivalent, which would provide a powerful interoperability feature.
This component warehouse will link to DISA’s registry initiative, known as DRIve. DRIve is a registry of standards and specifications developed by DISA’s affiliated organizations, and is compliant with version 2 of the ebXML registry specifications. The component warehouse would assign a unique identifier to each schema component, which the registry would index as part of the metadata for the schema or component, depending on the level of detail appropriate for the specification.
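This paper does not specify how component identifiers would be formed. As one possibility, the sketch below derives a repeatable identifier from the schema and component names using name-based UUIDs. The `urn:example:component-warehouse` namespace and the `component_id` function are hypothetical.

```python
import uuid

# Hypothetical namespace; the identifier scheme is an assumption,
# not something defined by DISA or the ebXML registry specifications.
WAREHOUSE_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "urn:example:component-warehouse"
)


def component_id(schema_name, component_name):
    """Derive a stable identifier from the schema and component names."""
    return uuid.uuid5(WAREHOUSE_NAMESPACE, f"{schema_name}#{component_name}")


CORE_ID = component_id("OTA_VehAvailRateRQ.xsd", "VehicleAvailRQCoreType")
print(CORE_ID)
```

A name-based identifier has the property that the same component always receives the same value, so a registry could index it without a central counter. Randomly assigned identifiers would be an equally valid design.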
The component warehouse could also support Web services applications. Universal Description, Discovery and Integration (UDDI) registries could list component identifiers as part of one or more tModel descriptions of services. Likewise, Web Services Description Language (WSDL) services could reference component identifiers as part of their Web service descriptions.
Marcel Jemio is DISA’s Director of Technical Operations and has served as specifications manager for the OpenTravel Alliance (OTA) since September 2001. For OTA and other DISA-affiliated standards organizations, Jemio manages the direction and application of all XML-related technology, including XML Schema, XML Web Services (SOAP, ebXML MS 2.0, WSDL), XSLT, XPath, and SVG. Jemio also wrote the DISA Componetizer program that extracts XML components for rapid documentation of XML Schemas, and leads development of DISA’s XML Component Repository, which inventories and maps several XML Schema vocabularies.
Jemio is DISA’s lead representative to the World Wide Web Consortium, and serves on the group’s Web Services Architecture and XML Schema working groups. Jemio also takes part in development of the ASC X12 XML Reference Model.
In previous work, Jemio served as a
systems engineer with Excelon Corporation, where he managed multiple development
teams, and deployed corporate Internet-based products for multiple clients.
Jemio also conducted client pre-sales meetings and presentations, facilitated
functional requirements sessions, and participated in the system/database
design, development and deployment of corporate products. He also has experience
in product and project management for other software and end-user companies.
Alan Kotok is DISA’s Director of Publishing and editor of E-Business Standards Today, published by DISA as an online daily newswire and in a weekly newsletter. Kotok previously served as DISA’s Director of Education and as standards manager for the OpenTravel Alliance.
Before joining DISA in 1999, Kotok served 10 years with Graphic Communications Association (GCA) as Director of Management Technologies and then as Vice President for Electronic Business. Before joining GCA, he served 15 years with U.S. Information Agency in the U.S. and overseas, becoming chief of the agency’s technology planning staff.
He is the author of two books, most recently ebXML: The New Global Standard for Doing Business on the Internet (with David Webber), ISBN: 0735711178, New Riders Publishing, August 2001. Kotok also writes frequently for the information technology trade press, and is author of three DISA white papers on e-business standards.
13 September 2002
Copyright © 2002, Data Interchange Standards Association, all rights reserved