Re: Requirements

From: Ralph Hartley (hartley@aic.nrl.navy.mil)
Date: Wed Mar 07 2001 - 18:01:11 CET


Devin Kouts wrote:

> Ralph Hartley wrote:
>
>> 1 It must be permitted to include information that not all programs
>> can, or wish to, understand.
>
> Information (data) in CaveXML falls into two categories, (1) part of
> the CaveXML baseline and, (2) extensions to that baseline. Either
> type can handle your request, but I submit that the baseline should
> be somewhat constrained to the majority of data that we historically
> record. For instance, embedding an image file into the data file goes
> well beyond what we do now, but seems like a logical extension to
> CaveXML. Make such a capability an extension that over time could be
> moved into the baseline.

What I mean is that Bad Things shouldn't automatically happen when an
application that only understands baseCaveXML is given a file that
contains extended data.
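
A minimal sketch of what I mean, in Python. The element names (survey,
shot, photo) and attributes are made up for illustration and are not taken
from any CaveXML draft. The reader handles what it knows and notes, rather
than chokes on, anything else:

```python
import xml.etree.ElementTree as ET

# Hypothetical baseline vocabulary; the real CaveXML element names are not assumed.
BASELINE_TAGS = {"survey", "shot"}

SAMPLE = """<survey name="Example">
  <shot from="A1" to="A2" length="3.2"/>
  <photo href="a1.jpg"/>
  <shot from="A2" to="A3" length="4.0"/>
</survey>"""


def read_baseline_shots(xml_text):
    """Return (shots, skipped): the baseline shots plus the tags that were ignored."""
    root = ET.fromstring(xml_text)
    shots, skipped = [], []
    for child in root:
        if child.tag == "shot":
            shots.append((child.get("from"), child.get("to"), float(child.get("length"))))
        elif child.tag not in BASELINE_TAGS:
            # Extension data: record that it was seen and carry on.
            skipped.append(child.tag)
    return shots, skipped


shots, skipped = read_baseline_shots(SAMPLE)
```

Here the photo element is skipped and both shots are still read.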

>
>> 2 It must be possible - preferably easier than the alternative - to
>> preserve all information (including information the converter does
>> not understand) through a round trip to/from another format. If this
>> could be done even if the file was edited in the other format, that
>> would be a Good Thing.
>
> This is a function of how an application handles a CaveXML formatted
> file. If a program uses a CaveXML file, does some processing, and then
> modifies the original file (resulting in a loss of data) then it
> sounds like an issue for the application user to consider. I.e., do
> they want to keep using that application or not?

Notice that most of these principles are of the form "It must/should be
possible/easy to ...". Of course some applications will fail to do them,
by accident or design. The data format can only facilitate doing them.
XML, by making it easy to parse what you can't understand, advances this
goal all by itself.

Also, I would accept as fulfilling this goal a converter that kept a
copy of the original CaveXML file, and could merge changes in the
converted file back in.
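
Roughly, such a converter might look like the following Python sketch. The
element names, the id attribute on shots, and the "other format" (a plain
dict of shot lengths) are all assumptions made up for the example:

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical file: two shots plus a photo element the converter does not understand.
ORIGINAL = (
    '<survey><shot id="s1" length="3.2"/>'
    '<photo href="a1.jpg"/>'
    '<shot id="s2" length="4.0"/></survey>'
)


def export_lengths(root):
    """Convert to the simpler 'other format': a dict of shot id -> length string."""
    return {shot.get("id"): shot.get("length") for shot in root.iter("shot")}


def merge_lengths(original_root, edited):
    """Apply edits to a copy of the kept original; everything else is left alone."""
    merged = copy.deepcopy(original_root)
    for shot in merged.iter("shot"):
        if shot.get("id") in edited:
            shot.set("length", edited[shot.get("id")])
    return merged


root = ET.fromstring(ORIGINAL)
other = export_lengths(root)
other["s2"] = "4.5"  # an edit made in the other format
merged = merge_lengths(root, other)
```

The photo element never passes through the other format, yet it is still
present in the merged result because the merge starts from the kept copy.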

> Valid data must flow from an authoritative source. If I give you the
> Twisted Fissure data in CaveXML and you use an app that bungs it all
> up, no big deal. Just contact me (or the authoritative source) and
> I'll give you another copy of the original data.

Can your great great granddaughter do that? People die, they lose
interest, they lose their own data. Also, my app may add information to
the file that I need, but isn't in the original.

>
>> 3 It should be possible to include enough information in a file to
>> allow a round trip from/to another format, reproducing the original
>> exactly. Conversion programs don't have to do that, but it should be
>> possible. For some formats, this may be impossibly hard.
>
> Not sure I follow you on this. If I have a complete record in the
> first CaveXML file and some application converts it to another XML
> (or other file type) and then converts it back to CaveXML, any
> resultant data loss is the fault of the application, not the original
> CaveXML file.

I am saying that the format must allow a CaveXML file to contain a
sufficiently complete record for this to be possible. I don't want the
application/converter author to be able to say "I couldn't do that
because CaveXML doesn't let me". I want it to really be the fault of the
application.

Note that here I'm talking about otherformat->CaveXML->otherformat.
Number 2 covers the (trickier) CaveXML->otherformat->CaveXML round trip.

>
>>
>> 5 It should be possible to exclude any information from a file that
>> is not needed for the task at hand.
>
> Again, this is a function of the application processing data from a
> CaveXML file. Once the file is parsed and available in memory it is
> the application's choice as to what it uses from that collection of data.

But it should also be permitted to delete data from the file itself.
Some people will want to share lineplots, but not raw data. This is the
converse of some of the other requirements. They say it should be
permitted to INCLUDE things; this says it should be permitted to EXCLUDE
things.
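
As a sketch of the kind of operation the format should allow (Python; the
rawdata, lineplot and point element names are invented for the example and
are not part of any CaveXML draft), stripping the raw data should leave a
file that is still usable:

```python
import xml.etree.ElementTree as ET

# Hypothetical file holding both raw survey data and a reduced lineplot.
SAMPLE = (
    '<cave><rawdata><shot from="A1" to="A2" length="3.2"/></rawdata>'
    '<lineplot><point name="A1" x="0" y="0" z="0"/>'
    '<point name="A2" x="2.3" y="2.2" z="-0.3"/></lineplot></cave>'
)


def strip_elements(xml_text, tag):
    """Return the document as a string with every element named `tag` removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the element list first so the tree is not modified while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


shared = strip_elements(SAMPLE, "rawdata")
```

For this to be legal, the spec must not make the removed elements
mandatory.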

>
>> 6 There should be a standard way to record any information that is
>> common to at least two programs.
>
> I think you're talking about capturing those things that are commonly
> captured by most proprietary data storage formats. If that's so then I
> agree. This is the reason I did the matrix analysis of existing
> proprietary formats, to derive requirements for "common" information
> types (see it at www.psc-cavers.org/xml)

But you left out quite a bit. For instance, all of those programs can
store closed and unclosed lineplots in some form.

>
>> 7 There should be no absolute dependence on details of the survey,
>> or data reduction, process (e.g. station naming conventions).
>
> If you're implying that unique id's for stations become a requirement
> before you can put your data in a CaveXML file and call it valid, then
> I have to disagree. I can easily write an app that reduces cave
> survey data based upon the data we collect historically (without
> unique ID or IDREF). Things like ID or IDREF should be treated as
> optional. If your app needs them to do its thing, then you can
> generate them as you need to. I don't need it so I won't waste the
> time writing the code.

I don't really want to argue this point (again) in this thread. Briefly,
my point is that if you already have unique names for stations, it isn't
much extra code. If you don't, then all the king's horses and all the
king's men can't "generate them as you need to".

Perhaps you could allow reference to stations using only names IF you
included in the spec "Station names must be unique according to the
following definition of uniqueness ...". This would violate Principle 11
(see below).
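
To illustrate the kind of check that would then live outside the DTD, here
is a Python sketch. It assumes the simplest possible definition of
uniqueness (exact string match within one file), which is my assumption for
the example and not something anyone has proposed for the spec:

```python
from collections import Counter


def duplicate_station_names(names):
    """Return the sorted station names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical case: two surveys in one file that both start at "A1".
dupes = duplicate_station_names(["A1", "A2", "A3", "A1"])
```

Once a name such as "A1" is reported as duplicated, a shot that refers to
"A1" by name alone is ambiguous, and nothing in the file says which station
was meant.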

> There are many more. I actually put some effort into Requirements
> elicitation some months ago when I got the Cave Survey Data in XML
> effort rolling. I spoke with many of the people who write the apps
> that we use to render survey data and got their input. I reduced
> their comments into a requirements matrix (view it at
> http://www.psc-cavers.org/xml/Requirements.html). You can even read
> many of their original comments at
> http://www.psc-cavers.org/xml/Discussion.html.

I was one of them, remember? What I tried to write here is a bit higher
level, more in the nature of guiding principles.

I will also add:

11 It should be possible to determine if a file is a valid CaveXML file
using only commonly available tools. This means the definition of a
valid CaveXML file should be completely specified by a DTD or schema.

Ralph Hartley


