Re: <Comments>


From: devinkouts_at_earthlink.net
Date: Fri Jan 19 2001 - 16:13:12 CET


Message-ID: <3A685988.B7C780D@earthlink.net>
Date: Fri, 19 Jan 2001 10:13:12 -0500
X-Mailer: Mozilla 4.6 [en] (WinNT; U)
X-Accept-Language: en
To: cavexml_at_cartography.ch
Subject: Re: <Comments>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-cavexml_at_karto.baug.ethz.ch
Precedence: bulk

Good question Paul. The first thing that occurred to me was, "Gee, I've
never seen it done that way, so it would probably be bad form at the
very least." My next thought was, "but this could be an example of
'thinking outside the box'." So I thought about the reasons why you
would want to use a <Comment> tag, and then I tested what happens when
you don't use it.

First - Using the <Comment> tag (or any tag) supports the whole intent
of XML, which is the explicit and concise identification of textual data
with some human-readable mechanism (if that mechanism imparts meaning,
so much the better). Fundamentally the phrase "The compass was
refurbished by the manufacture on 1/1/2000" is delimited by a capital T
at the front and multiple white-space characters (or CRLF) at the end.
This is intuitive to a literate human, but a little challenging to
handle in software. So tags make it easier for the machine. But this
still doesn't answer the question "couldn't we just do away with the
comment tag and stick comment data anywhere we like?" Then I applied
the slippery-slope logic, i.e. if you get rid of <Comment> then why not
<Compass>, <Shot>, <Survey> and everything else? I realized the result
would again be a collection of information that the human brain could
parse easily enough, but software couldn't.

Tags are very important to how the machine understands data in the file.
Tags allow the machine to build a tree-like structure of the document
(known as the DOM, or Document Object Model) and apply machine logic to
that structure, e.g. searching, iteration, modification, linking, etc.
Tags also create the traction new XML technologies (like XPointer,
XLink, or XQL) need to work with an XML data store. A simple example of
how the machine could use a <Comment> tag would be a report generated by
the machine that includes all the <Comment>ary in the survey. After
locating all the data enclosed in <Comment> tags the machine would have
little difficulty preparing the report. But if the machine had to search
for all data that might be a comment, i.e. text that is not delimited
and is stuck in random places throughout the file, you could get back
more than you bargained for, especially if you decided to do away with
more than just comment tags. If you weren't using <Azimuth> tags you'd
also get all your bearing data mixed in with your comments.
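
As a rough sketch of the report idea above, here is how a program might
pull out every comment once the text is tagged. It assumes Python and
its standard xml.etree.ElementTree module; the sample fragment and the
function name collect_comments are made up for illustration and are not
part of any agreed cave survey format.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the same instrument data, with every piece of
# free text wrapped in a <Comment> element.
TAGGED = """\
<Instruments>
  <Comment>The compass was refurbished by the manufacture on 1/1/2000</Comment>
  <Compass>
    <Comment>Here's a new comment on the compass</Comment>
    <Correction units="..."/>
    <SerialNumber/>
  </Compass>
</Instruments>
"""


def collect_comments(xml_text):
    """Return the text of every <Comment> element, in document order."""
    root = ET.fromstring(xml_text)
    # iter("Comment") walks the whole tree and yields only Comment elements.
    return [(node.text or "").strip() for node in root.iter("Comment")]


comments = collect_comments(TAGGED)
for line in comments:
    print(line)
```

The search is a single call because the tag name says exactly which text
is commentary; nothing else in the file can be picked up by mistake.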

Second - setting aside all the logical reasons, I decided to see what
would happen if you didn't use the <Comment> tag. I placed the following
XML snippet into a validating editor - XML SPY v3.0

<Instruments>
    The compass was refurbished by the manufacture on 1/1/2000
    <Compass>
       Here's a new comment on the compass
      <Correction units= "..."/>
      <SerialNumber/>
    </Compass>
</Instruments>

First I checked it with the editor and didn't get back any complaints.
This means it is a well-formed XML document. But when I generated a DTD
from this snippet I got back some pretty interesting results, the first
indication of the confusion this practice could create. The DTD came out
like this...

<?xml version="1.0" encoding="UTF-8"?>
<!--DTD generated by XML Spy v3.0.7 NT (http://www.xmlspy.com)-->
<!ELEMENT Compass (#PCDATA | Correction | SerialNumber)*>
<!ELEMENT Correction EMPTY>
<!ATTLIST Correction units CDATA #REQUIRED >
<!ELEMENT Instruments (#PCDATA | Compass)*>
<!ELEMENT SerialNumber EMPTY>

This DTD lists elements alphabetically, so Instruments doesn't appear
first as it would in the XML file. The interesting thing is how the
Compass element is defined:

<!ELEMENT Compass (#PCDATA | Correction | SerialNumber)*>

This is a mixed-content declaration. It says Compass can contain #PCDATA
(parsed character data) mixed with any number of Correction or
SerialNumber elements, in any order. That's a little wishy-washy, and
not very strict. (The same thing occurs in the Instruments definition.)
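
To make the consequence concrete, here is a minimal sketch of what a
program has to do to recover the untagged text from the snippet above.
It assumes Python and its standard xml.etree.ElementTree module; the
function name loose_text is hypothetical. With no tag to ask for, the
only option is to sweep up every stray run of character data in the
tree.

```python
import xml.etree.ElementTree as ET

# The untagged snippet that was pasted into the editor.
UNTAGGED = """\
<Instruments>
    The compass was refurbished by the manufacture on 1/1/2000
    <Compass>
       Here's a new comment on the compass
      <Correction units="..."/>
      <SerialNumber/>
    </Compass>
</Instruments>
"""


def loose_text(xml_text):
    """Return every non-blank run of character data found in the tree."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        # ElementTree stores text before the first child in .text and
        # text following an element's end tag in .tail.
        for chunk in (elem.text, elem.tail):
            if chunk and chunk.strip():
                found.append(chunk.strip())
    return found


stray = loose_text(UNTAGGED)
for line in stray:
    print(line)
```

This finds the two remarks, but only because nothing else in the snippet
is loose text. The code cannot tell a comment from any other untagged
value, so every additional tag that gets dropped adds more noise to the
result.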

When I first put the <Comment> tag into the work I've been doing, it was
at the request of several people who have actively contributed over
time, especially Ralph Hartley, who stated the need to add commentary at
(basically) any location in the file. Meeting that need without the
<Comment> tag would spread the same lack of disciplined structure
across your entire data file.

So I guess my answer to your query would be this. The benefit to be
gained by using a <Comment> tag (i.e. clean, unambiguous information)
comes at a very low price. Furthermore, exclusion of a tag for
encapsulating data could set a precedent for bad practices later on. And
dealing with data that's not clearly delimited can be difficult for the
software programmer. While leaving out <Comment> tags is not the wrong
thing to do, it certainly doesn't seem to be the right thing to do. So I
would recommend using the <Comment> tag to enclose informal information
about the survey.

Great question Paul, I look forward to more opportunities to explore
things like this.

Devin Kouts

>I'm not saying place free text everywhere, just where you currently
>allow one (or more?) comment blocks.
>
><Instruments>
> <Comment>
> The compass was refurbished by the manufacture on 1/1/2000
> </Comment>
> <Compass>
> <Correction units= .../>
> <SerialNumber/>

>Is that really any different than:
>
><Instruments>
> The compass was refurbished by the manufacture on 1/1/2000
> <Compass>
> <Correction units= .../>
> <SerialNumber/>
>...

>What role does the Comment tag serve? What role could it serve?
>
><Comment Language=English>
></Comment>

--
Devin Kouts
Caver
Systems Engineer
www.psc-cavers.org



This archive was generated by hypermail 2b30 : Wed Feb 14 2001 - 00:03:52 CET