Task: Draw up the survey data model
Progress 2005-05-02
This Edition: Modified only to suit the revised main Task List. Further progress awaits completion of Tasks 6 and 7.
This is a relatively informal page for recording our progress as we work
through the above task. When the task is finished, or as appropriate, material from any decisions which we have made will be consolidated to a more formal separate area. Nothing on this page is set yet - all aspects are up for discussion; even items marked as Done can be re-opened if later considerations indicate that they should.
Discussion Colour Codes:
Done |
Current |
Future
1. Task Description
Only the cave survey data model is included in this task. Further data models to be considered later in the project will be for cave topology and cave maps. The items below describe various aspects of this task.
1.1 Cave Survey Data Model
This sub-task is to define the cave survey data model; it's needed so that we can comprehensively define the data fields and their groupings which may be needed in a data transfer or archive. In order to construct a robust CaveXML standard which won't need
difficult changes later when we want
to add extensions, it is important that we first
understand and agree on the data and data structures
involved in a cave survey, i.e. the data model. Although our initial pilot
CaveXML standard is planned to address only a simple commonly used surveying
technique, we need to be aware also of where the further complications
lie, so that when we are designing the simpler pilot standard, we can
allow for the seamless inclusion of the extras later.
Therefore these draft entity definitions and the accompanying draft diagram attempt to model the
data in a cave survey up to the point where the station positions
have been calculated. The
closer our models get to the real-world things involved, the more
robust the result will be. There are sure to be further entities
needed, but the ones shown should do as a framework for starting the
discussion. Of course not all these entities would be used in any
given survey, or used by any existing survey reduction program or
data transfer; one would just use those entities which currently applied.
One of the difficulties of course is that some cave surveying terms have different meanings for cavers in different parts of the world. As we are working on an international solution for data transfer and archiving, we will need to recognise this fact and try to settle on a single definition for each term, just for use in this project, which can most easily be accepted. Our "Comments" included in such compromise definitions can acknowledge the alternative definition(s) which may exist.
Once we have a coherent set of entities and their relationships,
then we can determine what fields are needed and what entity
they belong to (many survey fields have already been enumerated by
various people on their websites). We will also need to define the
so-called "business rules", i.e. how to handle things when
certain conditions apply, such as what fields to include when a
particular surveying technique is being used. Another issue we will need
to address is how an XML file can present the same data in the different orders and groupings which various users need. Once all this is done
we will be in a stable position to populate the XML structures which we have decided upon with the fields needed to describe, store and convey the survey data.
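As a rough illustration of what a "business rule" could look like, the sketch below treats it as a lookup from a surveying technique to the measurement types which that technique requires. The technique names, measurement names and the `missing_measurements` helper are all assumptions made for this example; none of them are agreed CaveXML terms.

```python
# Hypothetical rule table: which measurement types each technique requires.
# The names used here are placeholders, not agreed CaveXML definitions.
REQUIRED_MEASUREMENTS = {
    "traverse": {"distance", "azimuth", "inclination"},
    "gps": {"easting", "northing", "altitude"},
}


def missing_measurements(technique, recorded):
    """Return the measurement types the rule requires but the record lacks."""
    required = REQUIRED_MEASUREMENTS[technique]
    return sorted(required - set(recorded))


# A traverse record which has no vertical angle yet.
shot = {"distance": 12.4, "azimuth": 231.0}
print(missing_measurements("traverse", shot))  # ['inclination']
```

A program reading a survey file could apply such a table to decide which fields to expect for each technique, though the real rules will only be known once the fields themselves have been discussed.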
Although XML is inherently capable of describing complex data
relationships more easily than a relational database, it would be best to model the survey data first using relational database methods, because these methods are well established and understood, and because survey data will need to be stored
in a conventional relational database at various times for various reasons anyway; both approaches are needed if our work is to be accepted. Also, it may be that data can be stored more sequence-neutrally in a relational database than in XML, which may help us with the data sequencing issue mentioned above.
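The sequencing point can be pictured with a small sketch: the same flat rows, as they might come out of a relational table, are written to XML in two different groupings. The row contents, the element and attribute names (`survey`, `group`, `leg`) and the `to_xml` helper are all hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical flat "table" of legs, in no particular order.
LEGS = [
    {"trip": "T1", "segment": "S1", "from": "A1", "to": "A2"},
    {"trip": "T2", "segment": "S2", "from": "A2", "to": "A3"},
    {"trip": "T1", "segment": "S1", "from": "A2", "to": "B1"},
]


def to_xml(rows, group_key):
    """Serialise the same rows, nested under whichever column is chosen."""
    root = ET.Element("survey")
    # groupby needs its input sorted on the grouping key.
    ordered = sorted(rows, key=lambda r: r[group_key])
    for value, members in groupby(ordered, key=lambda r: r[group_key]):
        group = ET.SubElement(root, "group", {"by": group_key, "value": value})
        for row in members:
            ET.SubElement(group, "leg", {"from": row["from"], "to": row["to"]})
    return ET.tostring(root, encoding="unicode")


print(to_xml(LEGS, "trip"))  # legs nested under each trip
print(to_xml(LEGS, "from"))  # the same legs nested under each starting station
```

The rows themselves carry no preferred order; the grouping is imposed only when the XML is written, which is the sense in which the tabular form is more sequence-neutral.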
| Top
| CDX Home
| Main Task List
| Contents
| Task Description
| Plan
| Entity Defns
| Other Defns
| Relationships
|
1.2 Cave Survey Entities
The "Entities" referred to in this context are the real-world surveying things or events which we want
to record data about. A preliminary list for comment is shown below in alphabetical order. These are equivalent to the boxes shown in the
separate Entity-Relationship Diagram, which shows how the
entities are related to each other. Note that these entities are
different to
the "entities" referred to in XML and HTML syntax. The
detailed fields belonging to each entity are not being considered
at this stage of the task. The set of entities below covers the data model for the survey
measurements taken and the positions thereby calculated, though not
all surveys will involve all of the entities shown of course. Where a
definition uses terms appearing elsewhere in this list, an initial
capital letter has been used for those terms. Comments, examples and
a few possible fields for each entity have also been included to give a better feel for what the entity is about.
1.3 The Entity-Relationship Diagram
In the Entity-Relationship Diagram (ERD) each square represents an
entity, and the lines joining them represent any relationships between
them which are
relevant to our purpose. The label along the line gives some idea of
the type of relationship. These relationships are spelled out more fully in the text version of the diagram in the Entity Relationships section below.
Where the line has an arrowhead on one end only, it means
that a single entity at
the non-arrowhead end may be related to more than one entity
at the arrowhead end, but not vice versa. This is called a
"one-to-many" relationship. For example,
a CaveSystem could contain several Caves but a Cave would not normally
belong to several CaveSystems.
Where the line has an arrowhead on both ends, it is called a
"many-to-many" relationship. For example,
one Person could belong to several Teams, and one Team contains several Persons.
You can also have a "one-to-one" relationship,
where one entity at one end of the line is related to only one
instance of the entity at the other end.
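For anyone more used to tables than diagrams, here is a minimal sketch of how the one-to-many and many-to-many examples above are conventionally held in a relational database, using Python's built-in `sqlite3` module. The table names, column names and sample rows are assumptions for illustration only, not a proposed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One-to-many: each cave row points at (at most) one cave system.
    CREATE TABLE cave_system (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cave (
        id INTEGER PRIMARY KEY,
        name TEXT,
        cave_system_id INTEGER REFERENCES cave_system(id)
    );
    -- Many-to-many: a junction table pairs persons with teams.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE team_member (
        person_id INTEGER REFERENCES person(id),
        team_id INTEGER REFERENCES team(id),
        PRIMARY KEY (person_id, team_id)
    );
""")
conn.execute("INSERT INTO cave_system VALUES (1, 'Example System')")
conn.executemany("INSERT INTO cave VALUES (?, ?, 1)", [(1, "Cave A"), (2, "Cave B")])
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO team VALUES (?, ?)", [(1, "Red"), (2, "Blue")])
conn.executemany("INSERT INTO team_member VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])

caves_in_system = conn.execute(
    "SELECT COUNT(*) FROM cave WHERE cave_system_id = 1").fetchone()[0]
teams_for_ann = conn.execute(
    "SELECT COUNT(*) FROM team_member WHERE person_id = 1").fetchone()[0]
print(caves_in_system, teams_for_ann)  # 2 2
```

The one-to-many link needs only a reference column on the "many" side, whereas the many-to-many link needs a table of its own, which is one reason the two kinds are drawn differently in the diagram.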
View ERD in this Window, or
(Note: If you're using Internet Explorer 6 with the above new-window option, the diagram may retreat to an indistinct image at the left side of the page; if so, hover your mouse over this image until a 4-arrow thing appears, which you can then click on. This should expand the diagram to its proper size. Does anyone know how to stop IE6 shrinking the image ("fit-to-screen")? It does not happen with Netscape or Mozilla.)
(Technical note: The original of the diagram is vector-based in StarOffice 5.2 Draw. It is then exported to GIF format for the web page.)
2. Task Plan
The detailed steps of the plan are listed below.
Draw up the survey data model:
- Sort out any tricky or non-entity terms which
we may need to use during discussions.
- Begin at the top end of the Cave Survey E-R Diagram and discuss each
entity in logical order down the page until all definitions and
relationships for this data model are agreed. Only a token number of
fields would be discussed during this step - just enough to clarify
how the entity would fit in.
- When an entity's definition and how it fits has been agreed,
the entity will be updated on this web page and on the diagram to show what
we have agreed, and its colour set to "Done". Earlier versions
will remain available.
3. Cave Survey Entity Definitions
The draft definitions below are being set for our specific CaveXML
purpose, and may differ from definitions used for other purposes,
i.e. we are not trying to set up normative definitions for cave
survey terms for universal use, though hopefully most of them could
serve that purpose anyway. Some of these terms may have differing existing definitions in different parts of the world; for use in this project, we will need to settle on a single definition for each, while acknowledging any alternative definitions in the respective "Comments".
- Branch
- A survey network element which is a sequence of one or more Legs
which join two adjacent Nodes in the network.
Possible fields: node1 name, node2 name, leg(s).
- Cave
- A single cave which is being surveyed.
Possible fields: name, survey IDs, parent cave system,
...
- CaveSystem
- A collection of related Caves, or a complex Cave, which is
being
surveyed.
Possible fields: caves, projects.
- Fieldbook
- A book or identified collection of documents in which the survey
readings were originally recorded during the surveying. It may
contain material related to several surveys, projects, and/or caves.
Possible fields: book ID, owner, caves.
- Instrument
- A specific surveying instrument used during a survey Segment. An
Instrument's corrections might change from Segment to Segment.
Possible fields: instrument type, serial number, owner,
manufacturer,
model, date last serviced, date manufactured, corrections and their
dates.
- Interpoint
- An intermediate point along a Shot or Leg from which additional
observations are taken. An Interpoint's position can be determined by
its distance along the Shot from one or other of the Shot's Stations, as specified. An
Interpoint does not form a necessary part of the survey network
structure. For example, additional cross-sections may have been taken
at Interpoints by means of Rays.
Possible fields: station from, distance from station, rays
taken.
- Leg
- The set of final consolidated survey Measurements related to the
connection between two adjacent Stations according to the surveying
Technique being used. For example, it might be the result of traverse
readings with multiple Shots or sightings in one or both directions
and averaged readings. A Leg could be the set resulting from the
survey measurements, or it could be the statistical set resulting
from later adjustment of the network.
Possible fields: station from, station to, distance,
direction,
vertical angle, segment, averaged yes/no.
- Map
- A visual representation resulting from one or more Surveys.
Possible fields: map ID, name, size, horizontal scale(s),
vertical
scale(s), drafter(s), producer(s).
- Measurement
- A Measurement is one of the fundamental quantities which was
used
to calculate the position of a target, usually a new Station, from
the Position of an existing Station. For example, in a normal tape
and compass traverse the Measurements between the two Stations would
be: direct distance, horizontal direction, and vertical angle. (Other
cases: (1) independently determined, e.g. by GPS, (2) by
triangulation, etc). A Measurement can be used in both a Leg and a
Shot, though in a Leg, the Measurement may be the result of
determining the final (statistical) Position of the new Station.
Possible fields: name, value, units, item type being
measured,
item
being measured, technique, method.
- Method
- The surveying and calculation method actually used for obtaining
the values for one of the Measurements of the survey. For example,
the "distance" Measurement could be obtained by any of the
following Methods: tape or rangefinder (a direct single
measurement), topofil
(difference of two readings), stadia staff (two intercepts and a
vertical angle), etc. The chosen Method will affect how many values
are contained in a Shot. There will be a range of Methods defined,
and new ones will be needed from time to time. This is a lookup
reference entity rather than containing values from any specific
survey.
Possible fields: name, qty of measurements required,
instrument
type(s) required.
- Node
- A survey network element which is a Position of a Station which
is the meeting point of more than two Legs, or which is otherwise
needed in manipulating the network.
Possible fields: name, legs connected.
- Organisation
- An organisation associated with any aspect of a Survey or survey
Project.
Possible fields: name, code, initials, members.
- Person
- A person participating in any aspect of a Survey or survey
Project.
Possible fields: name, contact details, orgs associated
with.
- Point
- A physical point occupied by one or more Stations. It may or may
not have a physical marker. It may have multiple sets of Position
co-ordinates from its occupying Stations. It may have several names
ranging from an official government designation to a series of cave
survey station names.
Possible fields: name(s) + nametype(s), marker type, date
marked,
person placing mark, org placing mark.
- Position
- A calculated location for a Station or Interpoint. There may be
several Positions for the one Station, each derived from, for
example, different Segments and/or adjustment Techniques.
Possible fields: easting, northing, co-ord system used,
altitude,
height datum used, units, latitude, longitude, station, program,
program version, adjustment technique, date calculated, program
operator, segment.
- Project
- A cave/karst survey and mapping project considered to require
extended work over multiple Trips and possibly comprising multiple
Surveys.
Possible fields: name, date started, leader, org(s)
involved,
cave
system.
- Ray
- A set of Measurements from a Station or Interpoint to a target
point where the latter does not occupy a formal Station. For example,
"left", "right", "up" or
"down"
sightings would each be an example of
a Ray where only one Measurement was recorded, the other
Measurements and target being implied. A Ray is a specialised kind of
Shot which is different enough to warrant its own entity.
Possible fields: ray type, distance, station from,
interpoint from, target, horizontal direction, vertical angle.
- Role
- One of the types of task being performed in a particular survey
segment. Examples: compass reader, elevation reader, tape reader,
sketcher, recorder, data processor, etc. The key for a Role would be
a combined segment+person, with role-type as a field. If the
role-type was unknown in a particular instance, this field could be left
blank or given a value of, say, "Unknown".
Possible fields: segment used in, person, role name.
- Segment
- Part or all of a cave/karst survey Trip which is carried out
under a single set of conditions such as Team members, Instruments,
Technique, Methods, instrument corrections, etc. That is, a Segment
is the largest component in a Survey (highest part in the hierarchy)
to which these other values can be attached. A Segment could consist
of Stations but no Legs, if the Station Positions were being
determined directly. A single Leg later joining two Segments would be
a new Segment. Because a Segment has a single set of conditions, it
could be manipulated as a single unit if desired.
Possible fields: leg(s), role(s), trip, instrument(s),
technique, ray
type.
- Shot
- A set of the actual survey readings resulting from one sighting
between two adjacent Stations before any Instrument or other
corrections have been applied. The number of readings in the set for
a Shot will be determined by the Technique and Methods being used,
and hence the number of different Measurements required and the
number of readings needed for each Measurement, e.g. two for a
Topofil length. Repeated sets of readings for the purpose of
averaging are considered to be separate Shots. In the simplest case,
a single Shot becomes the Leg between two Stations.
Possible fields: station to, station from, tape distance,
magnetic
bearing, vertical angle.
- Spur
- A survey network element which is a sequence of one or more Legs
and which is connected to the rest of the network by only one end.
Possible fields: node, leg(s).
- Station
- A named end of a survey Leg or a directly established point,
at a
particular physical Point. Its location could be the result of Shots
from or to other Stations, or of independent observations such as by
GPS or by radio or electromagnetic methods. A Station may have one or
more Positions, for example the original position calculated from
the survey field measurements, and others resulting from correction
processes such as loop closure or statistical adjustment of the
survey mesh, or by later resurveys. The same Station could be in more
than one Segment.
Possible fields: name, point, leg(s), shot(s),
position(s).
- Survey
- A related collection of cave/karst survey data which can stand
alone, or may form part of a larger survey Project.
Possible fields: cave(s), project, trip(s), location of
data,
date
started.
- Team
- A semi-permanent group of People who carry out cave surveys. A
particular Segment may have been surveyed by a particular Team, or by
a group of people not belonging to any formal Team. The informal
"team" of People who have carried out the surveying in a
particular Segment can be found by examining all the Roles related
to that
Segment.
Possible fields: name, members, formation date.
- Technique
- The type of surveying technique used in a surveying Segment to
enable the calculation of the position of each Station, and also the
calculation technique possibly used later in its mathematical
adjustment. Technique examples are traverse, triangulation,
resection, GPS, and the various survey adjustment techniques. A
survey Technique will require a particular set of Measurement types,
and each Measurement type will be obtained by using a particular
Method. If a survey Segment field recorded the use of a particular
Technique, then a program would use the "rules" for that
Technique
to guide its subsequent action on the various Measurements. This is a
lookup reference entity rather than containing values from any
specific survey.
Possible fields: name, measurement type(s), purpose.
- Trip
- A cave survey trip in which one or more Segments of survey for a
Survey or Project are carried out during one nominally continuous
time period.
Possible fields: name, start date, end date, survey
belonged
to.
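To give a feel for how the network entities (Node, Branch, Spur) could be derived from a set of Legs, here is a minimal sketch. It assumes Legs are supplied simply as pairs of station names, and it takes the Node definition strictly as a Station where more than two Legs meet; the `classify_network` function and the sample station names are hypothetical. Duplicate Legs between the same two Stations are treated as one, and a network with no such Node (a plain traverse or a single closed loop) produces no Branches or Spurs here, so the "otherwise needed" Nodes of the definition are not covered.

```python
from collections import defaultdict


def classify_network(legs):
    """Split (station_from, station_to) Legs into Nodes, Branches and Spurs."""
    neighbours = defaultdict(set)
    for a, b in legs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Node: a Station where more than two Legs meet.
    nodes = {s for s, adj in neighbours.items() if len(adj) > 2}

    branches, spurs, seen = [], [], set()
    for node in sorted(nodes):
        for first in sorted(neighbours[node]):
            if frozenset((node, first)) in seen:
                continue  # this Leg was already walked from the other end
            seen.add(frozenset((node, first)))
            chain, prev, here = [node, first], node, first
            # Follow Stations with exactly two Legs until the chain ends.
            while here not in nodes and len(neighbours[here]) == 2:
                nxt = next(s for s in neighbours[here] if s != prev)
                seen.add(frozenset((here, nxt)))
                chain.append(nxt)
                prev, here = here, nxt
            # Ending on a Node makes a Branch; a dead end makes a Spur.
            (branches if here in nodes else spurs).append(chain)
    return nodes, branches, spurs


legs = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E"), ("E", "F"), ("E", "G")]
nodes, branches, spurs = classify_network(legs)
print(branches)  # [['B', 'D', 'E']]
```

In this example B and E are the Nodes, the two Legs B-D and D-E form the single Branch, and the four remaining Legs are each a one-Leg Spur.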
4. Other definitions
These are other tricky or non-entity definitions which we may need to use during our
discussions, and hence will need to agree on beforehand. We will of course end up
eventually defining all the fields which belong to the entities, but these
ones below need clarification early on. Any others? We can accumulate
them here until they get covered in specific field definitions. Some of these terms may have differing existing definitions in different parts of the world; for use in this project, we will need to settle on a single definition for each, while acknowledging any alternative definitions in the respective "Comments".
4.1 General
- Altitude
- The height of a point above or below mean sea level.
Comments:
- Altitude as opposed to Elevation, because the latter can be confused with Inclination (vertical angle).
- Azimuth
- The horizontal direction of a line of sight measured clockwise from the North line
in the range 0-360 degrees or equivalent. It will be a True, Grid,
Magnetic, or Assumed Azimuth depending on whether the North line is
True, Grid, Magnetic, or Assumed.
Comments:
- Azimuth as opposed to Bearing.
- Bearing
- The horizontal direction of a line of sight measured from a North, South, East or
West line in the range 0-90 degrees or equivalent. It will be a True,
Grid, Magnetic, or Assumed Bearing depending on whether the line is
True, Grid, Magnetic, or Assumed. Example: Bearing E30°S (meaning 30° South of the East line, which is also Azimuth 120°).
Comments:
- Bearing as opposed to Azimuth.
- This is not common usage of the term in cave
surveying of course, but we may need to discuss such measurements and
will need a term for it, so we might as well use the correct one.
"Bearings" are likely to arise if integrating professional, land, or historic surveys with our cave surveys.
- The term "Quad" is also sometimes used for this type of reading.
- Entity
- A real-world thing, event or concept that we want to record
data about. Examples are Cave, Instrument, Trip.
Comments:
- Entities are represented by the squares in the diagram.
- Note that these entities are different to the "entities" referred to in XML and HTML syntax.
- Field
- A property of an Entity. For example in a database table, the fields are normally represented by the columns in the table:
"start date" is a field of the entity "trip", so in a table which
listed trips, [Start Date] would be one of the columns. And in an XML file,
any particular field would be represented either as an "element"
or sub-element in its own right, or as an "attribute" of another
element, depending on various considerations about that field.
Comments:
- Fields
are not being shown in the diagram, but will be listed out when
we come to discuss them in detail later in the Task.
- Inclination
- The angle in a vertical plane between a line of sight and the horizontal, positive above the horizontal and negative below.
Comments:
- Inclination as opposed to Elevation, because the latter can be confused with Altitude.
- Traverse
- A general term for a contiguous series of Legs in a survey. It
may span several Trips and several Methods, etc.
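The relationship between Bearing and Azimuth as defined above can be sketched in a few lines. The notation assumed here follows the "E30°S" example, i.e. a reference line, an angle of 0-90 degrees, and the side towards which the angle is turned; the `bearing_to_azimuth` function and its argument layout are assumptions for illustration only.

```python
# Azimuth of each reference line, in degrees clockwise from North.
LINE_AZIMUTH = {"N": 0.0, "E": 90.0, "S": 180.0, "W": 270.0}


def bearing_to_azimuth(base, angle, toward):
    """Convert a Bearing such as ('E', 30, 'S') to an Azimuth in degrees."""
    if not 0 <= angle <= 90:
        raise ValueError("a Bearing angle must lie in the range 0-90 degrees")
    # A quarter turn clockwise from the base line gives 90, anticlockwise 270.
    turn = (LINE_AZIMUTH[toward] - LINE_AZIMUTH[base]) % 360
    if turn == 90:
        return (LINE_AZIMUTH[base] + angle) % 360
    if turn == 270:
        return (LINE_AZIMUTH[base] - angle) % 360
    raise ValueError("'toward' must be a line adjacent to 'base'")


print(bearing_to_azimuth("E", 30, "S"))  # 120.0
```

Whether the result is a True, Grid, Magnetic or Assumed Azimuth depends, as in the definitions, on the kind of reference line the Bearing was measured from; the conversion itself does not change that.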
4.2 Survey Data Stages
These are the terms we have decided to use for the processing stages which a particular set of survey data might go through between the original survey readings in the cave and the calculated and adjusted co-ordinates ready for preparing a map.
- Field Data
- The unaltered survey data (readings and/or sketches) recorded in the field by whatever means, or verified copies thereof. (Accepted 2003-02-07)
Comments:
- For example, paper-based records, or decipherable images thereof which may have been altered but only to clarify the original data (for example, marked up photocopies), or data downloaded unaltered from an instrument (survey instrument, PDA, laptop, etc), or data still stored and observable in an instrument.
- Raw Data
- An initial unaltered digital copy of some or all of the survey Field Data, now ready for editing, validity checking, calculation or other processing. (Accepted 2003-02-07)
Comments:
- Where the Field Data was already in digital form, e.g. downloaded from an instrument, the Raw Data version could be an identical but editable copy, whereas the Field Data version would effectively be a read-only copy.
- Raw Data could include sketches now converted to fixed or editable digital form.
- The Raw Data could be in any format, including that of a survey processing program into which the Field Data has been typed.
- If such a program unilaterally alters the data in any material way as it is being entered, then the data has become Edited Data because it differs from the Field Data, i.e. a Raw Data version has effectively been skipped.
- Such a survey program might also store the data in a proprietary binary format unconducive to easy data exchange, therefore the coming CaveXML standard may need to define suitable forms of Raw Data to allow free exchange. Such binary data after being exported to a text format might qualify. Acceptable formats for raster or vector graphical data may also be required.
- Edited Data
- Raw Data which has had, or is having, any kind of mistakes edited out of it, but to which no systematic instrument corrections have been applied. (Accepted 2003-04-05)
Comments:
- The data has been modified since its Raw Data version, but has not yet been certified as Accepted Data.
- Accepted Data - Corrected, Uncorrected and No Corrections
- The input data which is currently the accepted final version of individual Shots and any other measured data: Accepted Corrected if systematic instrument corrections have already been applied to the measurements, Accepted Uncorrected if systematic instrument corrections exist but have not been applied, Accepted No Corrections if no systematic instrument corrections were recorded. (Accepted 2003-04-05)
Comments:
- This data set is approved as now suitable for data reduction, but may or may not be ready for input to any particular survey reduction program depending on the program and what it can accept as input data.
- The "correction" terms refer to the application of systematic instrument corrections, not to adjustment of measurements in order to close any loops.
- Leg Data - Corrected, Uncorrected and No Corrections
- Accepted Data which provides only a single set of the necessary Measurements for each Leg, possibly by consolidation of several sets of accepted Shot data: Leg Corrected if systematic instrument corrections have already been applied to the Leg data, Leg Uncorrected if systematic instrument corrections exist but have not been applied to the Leg data, Leg No Corrections if no systematic instrument corrections were recorded. (Accepted 2003-04-05)
Comments:
- If only one Shot was taken between two adjacent Stations, then Accepted and Leg Data would be the same.
- Reduced Data
- Derived from Measurements taken during a Survey, Reduced Data is two- or three-dimensional co-ordinate data which gives the Position for Stations in the Survey. Any gross errors have been removed, and any systematic instrument corrections have been applied in earlier stages, but the co-ordinates have not yet been statistically adjusted for the distribution of the random errors inherent in any measurements. (Accepted 2003-04-27)
Comments:
- Typically co-ordinates would be northings, eastings and altitude based on true, grid, magnetic, or assumed Azimuths.
- Any gross errors would have been removed during the Editing phase, and any systematic errors would have been removed during the Correction phase.
- Adjusted Data
- Adjusted Data is Reduced Data after the application of statistical adjustment in order to distribute the random errors which remain after any gross and systematic errors have been removed. (Accepted 2003-04-27)
Comments:
- Typically this adjustment is done by closing any loops in the Survey.
- Adjusted Leg Data
- Adjusted Leg Data is fictitious Leg Data which has been generated from a set of Adjusted Data. (Accepted 2003-04-27, but a better descriptive name is needed.)
Comments:
- This is where for some reason simple leg data is required (e.g. to feed into a different program) but only co-ordinate data is available, so the leg data has to be "reverse engineered" from the co-ordinate data.
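The step from Leg Data to Reduced Data, and the "reverse engineering" of leg data from co-ordinates, can be sketched for the simple tape-and-compass case. This is only a minimal illustration under stated assumptions: Azimuth is measured clockwise from North and Inclination is positive above the horizontal, as in the definitions above, and no corrections or adjustments of any kind are applied. The function names are hypothetical.

```python
import math


def leg_to_offsets(distance, azimuth_deg, inclination_deg):
    """Turn one Leg into (easting, northing, altitude) differences."""
    az = math.radians(azimuth_deg)
    inc = math.radians(inclination_deg)
    horizontal = distance * math.cos(inc)
    return (horizontal * math.sin(az),   # easting difference
            horizontal * math.cos(az),   # northing difference
            distance * math.sin(inc))    # altitude difference


def offsets_to_leg(d_east, d_north, d_alt):
    """Recover (distance, azimuth, inclination) from co-ordinate differences."""
    horizontal = math.hypot(d_east, d_north)
    distance = math.hypot(horizontal, d_alt)
    azimuth = math.degrees(math.atan2(d_east, d_north)) % 360
    inclination = math.degrees(math.atan2(d_alt, horizontal))
    return distance, azimuth, inclination


offsets = leg_to_offsets(10.0, 120.0, -15.0)
print(tuple(round(v, 6) for v in offsets_to_leg(*offsets)))  # (10.0, 120.0, -15.0)
```

Adding the offsets to the co-ordinates of the "from" Station gives the Position of the "to" Station. Applied to two adjusted Positions, the second function yields the kind of fictitious leg described under Adjusted Leg Data.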
5. Entity Relationships
The draft below describes how the various survey entities above could
relate to
each other. This is a text representation of the Entity-Relationship
Diagram. "Many" below means more than one. The Entities are shown
within square brackets [ ].
Alphabetically by entity:
[Branch]
- connects to two [Nodes]
- contains one or more [Legs]
[Cave]
- could belong to a [CaveSystem]
- could have initiated many [Surveys]
- could be recorded in one or more [Fieldbooks]
[CaveSystem]
- contains one or more [Caves]
- could have initiated many [Projects]
[Fieldbook]
- records one or more [Segments]
- could belong to one [Person]
- could belong to one [Organisation]
- could record many [Caves]
[Instrument]
- used in one or more [Segments]
- used by one or more [Methods]
[Interpoint]
- belongs to one [Shot]
- contains one or more [Measurements]
- located at one or more [Positions]
- could connect to many [Rays]
[Leg]
- belongs to one [Segment]
- connects two [Stations]
- contains one or more [Measurements]
- contains one or more [Shots]
- could be part of one [Branch]
- could be part of one [Spur]
[Map]
- is contributed to by one or more [Surveys]
- is contributed to by one or more [People]
- could be produced by many [Organisations]
[Measurement]
- used by one [Technique]
- could form part of one [Leg]
- could form part of one [Shot]
- could form part of one [Ray]
- could form part of one [Interpoint]
- uses one [Method]
[Method]
- used by one or more [Measurements]
- uses one or more [Instruments]
[Node]
- is located at one [Position]
- is connected to by one or more [Branches]
- could be connected to by one or more [Spurs]
[Organisation]
- associated with one or more [People]
- could be involved with many [Projects]
- could own many [Fieldbooks]
- could have produced many [Maps]
[Person]
- could be a member of many [Teams]
- could be associated with many [Organisations]
- could be performing many [Roles]
- could contribute to many [Maps]
- could be involved in many [Projects]
- could own many [Fieldbooks]
[Point]
- is coincident with one or more [Stations]
[Position]
- belongs to one [Segment]
- could have resulted from a calculation or loop adjustment by
one [Technique]
- could be the location for one [Station]
- could be the location for one network [Node]
- could be the location for one [Interpoint]
[Project]
- initiated for one [CaveSystem]
- initiates one or more [Surveys]
- involves one or more [People]
- could involve many [Organisations]
[Ray]
- could connect to one [Interpoint]
- could connect to one [Station]
- contains one or more [Measurements]
[Role]
- utilised by one [Segment]
- performed by one [Person]
[Segment]
- is surveyed on one [Trip]
- could be surveyed by one [Team]
- is surveyed using one [Technique]
- utilises many [Roles]
- is surveyed using one or more [Instruments]
- contains one or more [Positions]
- could contain many [Legs]
- is recorded in one or more [Fieldbooks]
- contains one or more [Stations]
[Shot]
- belongs to one [Leg]
- connects two [Stations]
- contains one or more [Measurements]
- could contain many [Interpoints]
[Spur]
- contains one or more [Legs]
- connects to one [Node]
[Station]
- is located by one or more [Positions]
- could be connected to by many [Legs]
- could be connected to by many [Shots]
- is coincident with one [Point]
- could connect to many [Rays]
- belongs to one or more [Segments]
[Survey]
- could include many [Caves]
- could be initiated by one [Project]
- could contribute to many [Maps]
- initiates one or more [Trips]
[Team]
- could survey many [Segments]
- consists of one or more [People]
[Technique]
- used for calculation or loop adjustment of one or more [Positions]
- used for surveying in one or more [Segments]
- uses one or more [Measurements]
[Trip]
- contributes to one [Survey]
- surveys one or more [Segments]
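Because every relationship appears twice in the list above, once under each entity, a simple cross-check can help keep the text version consistent as it is edited. The sketch below holds a few of the entries as a dictionary and reports any link that is named from one side only. Only five entities are transcribed here, and the `one_sided_links` helper is hypothetical.

```python
# A few entries transcribed from the relationship list (not the full set).
RELATED = {
    "Branch": {"Node", "Leg"},
    "Leg": {"Segment", "Station", "Measurement", "Shot", "Branch", "Spur"},
    "Node": {"Position", "Branch", "Spur"},
    "Shot": {"Leg", "Station", "Measurement", "Interpoint"},
    "Spur": {"Leg", "Node"},
}


def one_sided_links(related):
    """List (a, b) pairs where a names b but b, although listed, omits a."""
    return sorted(
        (a, b)
        for a, targets in related.items()
        for b in targets
        if b in related and a not in related[b]
    )


print(one_sided_links(RELATED))  # []
```

Entities which are mentioned but have no entry of their own in the dictionary (Station, for instance, in this subset) are skipped rather than reported, so the check is only as complete as the data given to it.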
Previous versions:
2004-09-26 |
2002-07-05 |
P. Matthews