
XML Guidelines: Design considerations



Hi, All

Apologies for the late comments.

I would like to offer some observations from our experience designing
the XML DTD for the IODEF (Incident Object Description and Exchange
Format) Data Model description.
References for this work:
http://www.ietf.org/internet-drafts/draft-meijer-inch-iodef-00.txt
http://www.ietf.org/html.charters/inch-charter.html
http://listserv.surfnet.nl/archives/inch.html
http://www.ietf.org/rfc/rfc3067.txt

We had a good example in the IDMEF data model and XML DTD, with which we
committed to maintain compatibility.

However, extending the machine-oriented format of IDMEF to a
human-oriented description of computer security incidents, which may
need prolonged processing/investigation and archiving, confronted us
with some XML design problems that may be of a common character and
might usefully be addressed in the proposed XML Guidelines.

A short description and some considerations are given below; they can be
extended if discussion follows.

General observation: the document gives good insight into the internal
structure of an XML document and the general requirements for correct
use of XML, BUT it is still important to address some typical issues
that a designer of a protocol (or an application) normally faces at the
(initial) design stage: s/he has decided to use XML and knows what a
good (well-formed) XML document should look like, BUT s/he still needs
to solve a number of design problems to get from a technical idea/task
to an XML-based solution.

So, the problems to be addressed are:

1) what to use for the description of the data model or exchange format:
DTD or Schema - this is a major discussion wherever people have decided
to use XML
 
Considerations:

- DTD is simple and good enough if we use it for rather straightforward
generation of data objects/messages; DTD is easily readable and can be
used as a direct template for data generation;

- Schema has more benefits if there is an intention/need to use an
integrated design environment (like database, web forms or web services)
for further development of the application;

- Schema gives more flexibility if the data object is expected to have
several profiles, using Schema-based XML transformation and tools like
XSL/XSLT, XPath;

- Although Schema is more verbose, it gives more flexibility for
data model/document evolution and maintenance;

- there are many tools that use Schema as an input for further
application development;

- BUT Schema is much more difficult to work with (and sometimes to
understand);

- Schema is an object-oriented description and supports inheritance
relations; another related benefit of Schema is that it integrates
naturally with graphical presentation and design of the data
model/structure
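
To make the comparison a bit more concrete, here is a rough sketch (not
taken from IODEF; the Report and Description element names and the
severity attribute are invented for illustration). It declares the same
small structure once as DTD text and once as a Schema document, and then
reads the Schema with an ordinary XML parser, which is only possible
because a Schema is itself XML. I assume this is part of why so many
tools accept Schema as input.

```python
import xml.etree.ElementTree as ET

# Hypothetical declarations in DTD syntax. A DTD is not XML itself,
# so it is kept here only as plain text for comparison.
DTD_TEXT = """
<!ELEMENT Report (Description)>
<!ATTLIST Report severity (low | medium | high) #REQUIRED>
<!ELEMENT Description (#PCDATA)>
"""

# Roughly equivalent declarations as an XML Schema document (more verbose).
XSD_TEXT = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Report">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Description" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="severity" use="required">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="low"/>
            <xs:enumeration value="medium"/>
            <xs:enumeration value="high"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

XS = "{http://www.w3.org/2001/XMLSchema}"


def declared_elements(xsd_text):
    """Return the names of all elements declared in the Schema."""
    root = ET.fromstring(xsd_text.strip())
    return [e.get("name") for e in root.iter(XS + "element")]


def enumerated_values(xsd_text):
    """Return all enumeration values found in the Schema."""
    root = ET.fromstring(xsd_text.strip())
    return [e.get("value") for e in root.iter(XS + "enumeration")]


print(declared_elements(XSD_TEXT))
print(enumerated_values(XSD_TEXT))
```

The DTD version is three lines and easy to read; the Schema version is
several times longer but can be inspected and transformed with the same
tools as any other XML document.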


2) using attributes vs elements - although this topic is already
discussed in the document, it may benefit from more suggestions;

This topic has been discussed in detail in many XML Schema related
forums, producing some useful recommendations. So, I would recommend
extending section 4.9 with some of them (in addition to the existing
ones) from the sources below:

general archive of discussion -
http://www.oasis-open.org/cover/elementsAndAttrs.html
good summaries:
http://www.oasis-open.org/cover/holmanElementsAttrs.html
http://www.w3c.org/TandS/QL/QL98/pp/microsoft-serializing.html
http://www.oasis-open.org/cover/attrSperberg92.html

Summarising:

- put metadata into attributes, put content into elements (one way to
distinguish between metadata and content is to ask the question: "if I
remove this data/information, would my understanding of the content
change?"; if the answer is "no", this is rather metadata, i.e. an
attribute or descriptive information)

- attributes are more suitable for enumerated data 

- elements are logical units of information

- use attributes for computer-manipulated values

- entities (nodes) are expressed as elements

- properties (edges) and relations are expressed as attributes

- attributes are atomic characteristics of an element/object that have
no identity of their own; their meaning may change depending on the
element described

3) elements' identification and reference inside and outside the
document:

- this is often expressed via attributes, but we need a mechanism for
creating idents/attributes capable of this functionality
- should the ident reflect the hierarchy of the data model? In
well-developed standards like MIB and LDAP this is done, but should it
be followed in particular cases?
- should the use of XPath be recommended?
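
To show what an XPath-based reference inside a document could look like,
here is a minimal sketch. The ident and contact-ref attribute names and
the element names are assumptions made for the example, and it relies
only on the limited XPath subset supported by Python's ElementTree, not
on full XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: Record points at a Contact through its ident.
DOC = """
<Report ident="r-001">
  <Contact ident="c-01"><Name>CSIRT A</Name></Contact>
  <Contact ident="c-02"><Name>CSIRT B</Name></Contact>
  <Record contact-ref="c-02"/>
</Report>
"""


def resolve(root, ident):
    """Return the first descendant of root whose ident attribute matches.

    The path './/*' covers descendants only, so the root element itself
    is never returned.
    """
    return root.find(".//*[@ident='%s']" % ident)


root = ET.fromstring(DOC.strip())
record = root.find("Record")
target = resolve(root, record.get("contact-ref"))
print(target.find("Name").text)
```

Here the idents are flat; an ident that reflects the data model
hierarchy would make the lookup path itself carry structure, which is
the design question raised above.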

4) in section 4.5 about processing instructions it might be useful to
mention such a standard processing instruction as the one for XSL
(Extensible Stylesheet Language) together with XSLT, which may be a very
useful tool for referencing complex data objects and creating profiles;
it may also be useful when an XML document must be presented in
different output formats like a web page or other formatted object
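
For reference, the usual way to attach a stylesheet is the
xml-stylesheet processing instruction placed before the root element.
The sketch below adds one with Python's minidom; the stylesheet file
name report-to-html.xsl and the Report element are invented for the
example, and no transformation is actually run here.

```python
from xml.dom import minidom


def with_stylesheet(xml_text, href):
    """Return xml_text with an xml-stylesheet PI before the root element."""
    doc = minidom.parseString(xml_text)
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", 'type="text/xsl" href="%s"' % href
    )
    # Processing instructions for the whole document go in the prolog,
    # i.e. before the document element.
    doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()


# The stylesheet name is hypothetical.
result = with_stylesheet("<Report/>", "report-to-html.xsl")
print(result)
```

The same document could carry a different href for each output format,
leaving the data itself untouched.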

5) discuss whether the problem of presentation of XML documents should
also be covered in the document - this means talking about XML
transformation and processing techniques such as XSL, XSLT and SAX

6) a general recommendation to use good design tools for working with
XML DTD and/or Schema might also be useful; for complex data models it
should be a strategic solution that will guarantee information exchange
and maintainability


So, if the document is intended to be "guidelines for the Use..." and
not only requirements for use, it needs to contain more design
considerations in most sections or a separate design considerations
section.


Again, sorry for the late comments.

Regards,

Yuri Demchenko