On Mon, 5 Mar 2001, Ralph Hartley wrote:
> [...]
>
> Whoever does the conversion has to decide how much of the original he
> wants to include. My proposal allows enough information to be included
> to reproduce the original station names exactly, images of the notations
> in the survey book could even be used. Whether or not anyone thinks
> it is worthwhile to go back to the original names is not my choice,
> often I expect that they would, and it needs to be (and I think is) easy.
> As for units for the data, it could (and has) been argued either way. I
> would favor using the original units, because the conversion process is
> irreversible (round off, number of digits etc.). I would demand that
> there must be a STANDARD way to describe the units that were used, and
> that a units descriptor (which is also not usually found in the notes)
> must be REQUIRED (because that is not recreatable either).
> I might also accept allowing something like a "meters" attribute, for
> those that disagree (I see little need for it). As long as there is at
> least an option to include the original numbers.
> Ralph Hartley
I think my personal view differs fundamentally from Ralph's.
I favor having canonical forms and units, BUT ALSO leaving the
original ones around. You then don't have to "recreate" the
original, you still have it. (And programs that do further
processing still have standard units, so that they don't
have to be aware of all the conversion strangeness.)
In my view there would be the "original" text, and there
would be the parsed result (since only the parser is likely
to know the details of what was what and in what units).
(In this case I agree with Ralph that there should be a
standard way to do this.)
And there would be a canonical (standard units, etc.) form
that further processing could use. (And I vote for metres.)
I would have "raw" point names, and "canonical" point names
(since they are a language-specific issue).
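
To make the three layers concrete, here is a minimal sketch in Python
of one measurement carrying its original text, its parsed value with a
units descriptor, and its canonical value in metres. The names (Length,
parse_length, TO_METRES) and the unit strings are my own assumptions
for illustration; none of this is taken from any existing CaveXML
proposal.

```python
from dataclasses import dataclass

# Conversion factors to metres. The unit names are illustrative
# assumptions, not a proposed standard vocabulary.
TO_METRES = {"metres": 1.0, "feet": 0.3048}


@dataclass(frozen=True)
class Length:
    raw: str       # text exactly as it appeared in the survey notes
    value: float   # number the parser read out of the raw text
    unit: str      # units descriptor for the parsed value
    metres: float  # canonical form for further processing


def parse_length(raw: str, unit: str) -> Length:
    """Parse a raw length, keeping the original text next to the result."""
    value = float(raw)
    return Length(raw=raw, value=value, unit=unit,
                  metres=value * TO_METRES[unit])


shot = parse_length("12.5", "feet")
```

Nothing has to be recreated here: shot.raw is still the original text,
while later passes only ever need to look at shot.metres.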
My view of the world, which many may argue with (and have),
is that each processing pass adds to the file, with nothing
removed (except possibly some prior version of its own stuff).
And my view is to go out of my way to make incremental updates
practical, even if that means some things are done differently.
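
A minimal sketch of that "add, never remove" rule, again in Python.
The record layout, the pass names and the apply_pass helper are all
hypothetical; the only point is that a pass may replace its own
earlier output and must leave every other layer alone.

```python
def apply_pass(record: dict, pass_name: str, output: dict) -> dict:
    """Return a new record with this pass's output added.

    Layers written by other passes are carried over untouched; only an
    earlier layer written by the same pass is replaced.
    """
    updated = dict(record)
    updated[pass_name] = output
    return updated


# Hypothetical layers: the original text, then what each pass adds.
record = {"original": {"text": "A1 A2 12.5 045 -3"}}
record = apply_pass(record, "parsed",
                    {"from": "A1", "to": "A2", "length": 12.5, "unit": "feet"})
record = apply_pass(record, "canonical", {"length_m": 3.81})

# Rerunning the canonical pass replaces only its own earlier output.
record = apply_pass(record, "canonical", {"length_m": 3.81, "rerun": True})
```

After the rerun the record still has exactly three layers, and the
"original" and "parsed" layers are unchanged, which is what makes an
incremental update cheap.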
--- New Suggestion ------------------------------------------
Back in the days when the US Military had lots of money for
software projects, there was a procedure that they used that
I think would be appropriate here.
They would solicit differing proposals, and minimal (trivial)
implementations, and then have everyone comment on each of
the proposals suggested. (This included having everyone write
a paper giving the specific points on which they thought
their implementation did better than the others.)
Then a general vote was taken, and an effort was made to
bolt the good points of the losers onto the winner.
In our case we already have some examples of different
XML DTDs from various people (and the CaveXML site has pointers
to many, if not most, of them). Some of them are very good,
although they have differing views of how the world operates.
(And my views are different than both.)
I think that, for me, the time has come to stop arguing details
of what CaveXML should be like, and go off and flesh out a
version matching my ideas so that my ideas can be argued with
examples (and trivial implementations) rather than with
academic arguments.
Devin and Michael have already put out things that match their
views of the world. (Although I'm not sure they have anything
that produces/processes it yet? Correct me if I'm wrong.)
It's time I (and other outspoken folk here) put my effort
where my mouth is. I've already got some fast, numerically
stable code for network adjustments (in Ada 95), so I think
I can put together a working example on that end of the
world with a reasonable amount of work.
Since the main discussion has ground down to just picking
at various points, rather than any radical changes, this
seems like the right time for us to go flesh out alternatives
so that people on the list will have as large a choice of ideas
to steal as is practical. (And, hopefully, we'll all notice
problems with our suggestions ourselves as we try conversions
on a few random datasets.)
Ralph? Paul? You all with me on this approach?
In any case, I'll be going off and writing code...
This archive was generated by hypermail 2b30 : Mon Apr 02 2001 - 18:00:00 CEST