Representation and
Data Models
Dr. Kenneth McGwire
DRI Biological Sciences Center
7010 Dandini Blvd., Reno, NV 89512
Direct Comments to: kenm@maxey.unr.edu
Introduction
Several data structures have
been developed for manipulating map-based data in a GIS. These structures
fall within two predominant data models, either field-based or polygon-based.
Field-Based Model
The field-based model describes spatial variability using
a fixed sampling framework. The raster data structure, as found in remotely
sensed data sources, is the most common implementation of this representational
model. Typical data structures based on the field-based model include:
Polygon-Based Model
The polygon-based model stores the coordinates of those points
which define the precise boundaries of mapped objects, and has generally
become more common in the GIS arena. Objects are represented as points
(e.g. wells), lines built up from multiple points (e.g. transportation networks),
and polygons built up from multiple lines (e.g. soil units). Data structures
based on the polygon-based model include:
-
Arc/Node
-
Triangulated Irregular Network (TIN)
Differences Between Field and Polygon-Based Models
The fundamental difference between the field-based and polygon-based
models involves the emphasis on attributes.
Field-Based Model
The field-based model is sometimes referred to as being
"map-oriented" since the management of attributes is handled by a pre-defined
sampling scheme, regardless of the shape of the objects in a map.
This emphasis on a systematic spatial sampling framework
makes field-based data structures most effective for those data whose attributes
vary continuously over space. There is no explicit identification of the
contiguity of neighboring attributes in the field-based model.
Polygon-Based Model
The polygon-based model is sometimes referred to as being
"object-oriented" since attribute data are tagged directly to homogenous
objects whose shape has no predetermined limits.
The focus on homogenous regions in the polygon-based model
allows the attributes of extensive, homogenous features to be managed more
efficiently than in the field-based approach. Attributes of a feature can
be tracked using a single identification code, and the contiguity of these
attributes over space is explicit in the data model.
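The contrast between the two models can be illustrated with a minimal sketch. The grid size, the class names, and the variable names (`raster`, `polygons`, `attributes`) are assumptions made for illustration only; they do not come from any particular GIS package.

```python
# Field-based: a fixed 4x4 grid; every cell stores its own attribute value.
raster = [
    ["forest", "forest", "water", "water"],
    ["forest", "forest", "water", "water"],
    ["forest", "forest", "water", "water"],
    ["forest", "forest", "water", "water"],
]

# Polygon-based: each homogenous region is stored once as a boundary
# (a closed ring of X,Y vertices) tagged with an identification code.
polygons = {
    1: [(0, 0), (2, 0), (2, 4), (0, 4), (0, 0)],
    2: [(2, 0), (4, 0), (4, 4), (2, 4), (2, 0)],
}
attributes = {1: {"cover": "forest"}, 2: {"cover": "water"}}

raster_values = sum(len(row) for row in raster)  # one value per cell
polygon_values = len(attributes)                 # one record per feature
print(raster_values, polygon_values)             # 16 2
```

In this toy map the grid repeats the same label in every cell of a region, while the polygon form stores each label once against an identification code.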
Managing Attribute Data
GISs using the polygon model often have sophisticated methods
for managing attribute data.
-
Vector-based systems are often joined to a relational database
management system (RDBMS).
-
The RDBMS may be used to keep track of the topological relationships
which join points into lines and lines into polygons.
-
Systems using an RDBMS allow tabular data from other sources
to be linked to map features.
-
An RDBMS allows sophisticated queries regarding the multiple
attributes of features in the GIS.
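As a rough illustration of linking outside tabular data to map features and querying on several attributes at once, the sketch below uses Python's built-in `sqlite3` module as a stand-in RDBMS. The table names (`soil_units`, `soil_properties`), columns, and values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per mapped polygon, keyed by its feature ID.
    CREATE TABLE soil_units (
        feature_id INTEGER PRIMARY KEY, soil_type TEXT, area_ha REAL);
    -- Tabular data from another source, linked through soil_type.
    CREATE TABLE soil_properties (soil_type TEXT PRIMARY KEY, drainage TEXT);
    INSERT INTO soil_units VALUES
        (1, 'loam', 12.5), (2, 'clay', 40.0), (3, 'loam', 7.25);
    INSERT INTO soil_properties VALUES ('loam', 'good'), ('clay', 'poor');
""")

# Query on two attributes: well-drained soil units larger than 10 ha.
rows = conn.execute("""
    SELECT u.feature_id, u.area_ha
    FROM soil_units AS u
    JOIN soil_properties AS p ON p.soil_type = u.soil_type
    WHERE p.drainage = 'good' AND u.area_ha > 10
    ORDER BY u.feature_id
""").fetchall()
print(rows)  # [(1, 12.5)]
conn.close()
```

The feature IDs returned by such a query are what a GIS would use to select the matching polygons on the map.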
Definition: Topology
The logical basis for determining connectivity of
points, lines, and areas in the GIS. Polygon-based data models must rigidly
enforce topology in order to determine whether arbitrarily located points
are located inside or outside of a given polygon.
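The inside/outside test can be sketched with the common ray-casting approach, shown here as one possible technique rather than the method of any specific GIS. The function name `point_in_polygon` is hypothetical, and the polygon is assumed to be a single ring without holes. The test only gives a meaningful answer when the vertices form a properly connected ring, which is the condition that topology enforces.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, closed or open."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # X coordinate where the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))  # True
print(point_in_polygon(5, 2, square))  # False
```

A ray drawn from the point crosses the boundary an odd number of times if the point is inside and an even number of times if it is outside. Points lying exactly on the boundary are not handled specially in this sketch.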
Accuracy
The choice of data structure affects the accuracy of spatial
representation.
-
The most accurate data structure depends on the features
being sampled.
-
The polygon model is most useful in cases where changes in
land characteristics occur along well defined boundaries between relatively
homogenous spatial units.
-
Because vector-based systems define objects by their boundaries,
geometric representations of homogenous targets are generally more accurate
than in raster systems.
-
Mapped boundaries are often rough approximations of broad
zones of transition and the regions they define are often far from homogenous.
-
Those features which display continuous variation over space,
or for which the scale of observation makes precise taxonomic determination
impossible, might be quantified most effectively with the systematic sampling
of the raster structure.
Conversion
In an integrated remote sensing and GIS environment, both
data models must often be used in conjunction. This often requires a conversion
between raster and vector data structures.
Problems with converting vector maps to a raster structure
include:
-
The creation of stair-stepped boundaries
-
Small shifts in the position of objects
-
The deletion of small features
Problems with converting raster maps to a vector structure
include:
-
Potentially massive data volumes
-
Difficulties in line generalization
-
Topological confusion
Vector-to-Raster Conversion
-
The fixed grid of the raster data structure invariably leads
to a jagged, or stair-stepped, representation of polygon boundaries which
are oriented at an angle to the grid.
-
Forcing real-world features into a fixed raster grid can
shift feature boundaries by as much as half the dimension of a grid
cell.
-
The typical conversion rule of assigning values using that
class which occupies the greatest proportion of the grid cell may result
in the deletion of features which are smaller than a grid cell in either
the X or Y dimension.
-
Not only might small patches be deleted, but extensive narrow
features such as roads or streams may drop out as well.
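The loss of narrow features under the greatest-proportion rule can be demonstrated with a small sketch. As a simplifying assumption, a fine grid stands in for the vector map, and each output cell takes the majority class of a 3 x 3 block of fine cells. The function name `majority_resample` and the class codes are hypothetical.

```python
from collections import Counter

def majority_resample(fine, block):
    """Aggregate a fine grid into block x block cells using the majority class."""
    rows, cols = len(fine), len(fine[0])
    coarse = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            cells = [fine[i][j]
                     for i in range(r, min(r + block, rows))
                     for j in range(c, min(c + block, cols))]
            out_row.append(Counter(cells).most_common(1)[0][0])
        coarse.append(out_row)
    return coarse

# "F" = field, "R" = a road one fine cell wide running north-south.
fine = [list("FRFFFF") for _ in range(6)]
coarse = majority_resample(fine, 3)
print(coarse)  # [['F', 'F'], ['F', 'F']]
```

The road covers only 3 of the 9 fine cells in each output cell it passes through, so it never wins the majority and disappears from the result even though it spans the full height of the map.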
Raster-to-Vector Conversion: Data Volumes
Raster maps which contain a high degree of fragmentation
can create very large data volumes when converted to the polygon-based
data model.
In the extreme, a single pixel with a value different
from all its neighbors may require a single byte in a raster structure.
In a vector structure there would be:
-
4 sets of X,Y coordinates, one for each corner.
-
Each X and Y would require at least a 4-byte floating-point
value.
-
Depending on the implementation, the database might require
additional fields to:
-
Link these points into line segments
-
Link the line segments into a single polygon
-
Create an extra record in at least one attribute table.
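The arithmetic for the single-pixel case can be worked through as follows. The constant names are illustrative, and the total counts coordinate storage only; it deliberately leaves out the implementation-dependent link fields and the attribute record.

```python
RASTER_BYTES = 1       # one single-byte cell value
CORNERS = 4            # corners of the square pixel
COORDS_PER_CORNER = 2  # X and Y
BYTES_PER_COORD = 4    # 4-byte floating point, the minimum cited above

vector_coordinate_bytes = CORNERS * COORDS_PER_CORNER * BYTES_PER_COORD
print(vector_coordinate_bytes)                  # 32
print(vector_coordinate_bytes // RASTER_BYTES)  # 32
```

Coordinates alone therefore take at least 32 times the storage of the original pixel, before any topological or attribute overhead is added.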
Raster-to-Vector Conversion: Line Generalization
-
The degree of generalization in the boundaries of features
in vector-based maps is generally tied to the scale of data acquisition.
-
Raster-to-vector conversion routines may have difficulty
in identifying those points along a raster boundary which best define
a shape.
-
If all points are used in the conversion, data volumes will
be larger than required, and features will retain an undesirable stair-stepped
artifact along angled lines.
-
Vector-based GIS packages usually provide functions for line
generalization.
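The text does not name a specific generalization method. As one widely used example, the sketch below applies the Douglas-Peucker algorithm to a stair-stepped boundary. The function names and the tolerance value are assumptions chosen for illustration.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length

def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord joining the endpoints.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Otherwise keep that vertex and simplify each half separately.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

# Stair-stepped boundary traced from a raster along a diagonal.
stairs = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
print(douglas_peucker(stairs, 1.0))  # [(0, 0), (3, 3)]
```

With a tolerance of one cell width, the seven stair-step vertices collapse to the two endpoints of the diagonal. A tolerance that is too large would also remove vertices that define real shape.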
Raster-to-Vector Conversion: Topological Confusion
The contiguity of neighboring pixels is not explicit in the
raster data structure. However, vector data structures must explicitly
determine the connectivity of the points which define an object.
Two problems may cause ambiguity regarding the connectivity
of pixels in the conversion process:
-
The space between two similar objects may be less than a
pixel in width, so there will be no obvious boundary between them in a
raster file. After conversion, the vector-based GIS will treat these two
objects as a single feature.
-
There may be multiple ways in which pixels might be connected
when features are a single pixel in width or are only connected across
a diagonal.
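The diagonal ambiguity can be made concrete by counting regions under two connectivity rules. This is a minimal sketch: the function name `count_regions` and the sample grid are hypothetical, and cells with value 1 are assumed to belong to the feature class of interest.

```python
def count_regions(grid, diagonal):
    """Count groups of connected 1-cells using 4- or 8-neighbour connectivity."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            # New, unvisited feature cell: flood-fill its whole region.
            regions += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                i, j = stack.pop()
                for di, dj in steps:
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and grid[ni][nj] == 1 and (ni, nj) not in seen):
                        seen.add((ni, nj))
                        stack.append((ni, nj))
    return regions

# Two 2x2 patches that touch only at a corner.
grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(count_regions(grid, diagonal=False))  # 2
print(count_regions(grid, diagonal=True))   # 1
```

The same pixels yield either two polygons or one, depending on whether diagonal neighbours are treated as connected, so a conversion routine has to adopt one rule and apply it consistently.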
Methods for Mitigating Problems in Data Conversion
-
Filter raster data prior to performing a conversion to vector
data, so that there are no single pixel polygons or ambiguous connections
between proximate features.
-
Use a weighting scheme when converting vector maps to a raster
structure so that important classes do not drop out when they cover small
areas. This will unfortunately bias the area statistics of the resulting
map.
-
Select the class located at the center of each pixel location
when converting from vector to raster. This will tend to maintain smaller
features and will preserve overall area statistics. However, this method
in no way guarantees that the label of a specific pixel corresponds to
the predominant class at that location.
-
Use line generalization functions in the GIS after converting
a raster file to a vector data structure. This will reduce unnecessary
points, but may also distort shapes in undesirable ways if not used with
caution.
-
Use additional data layers in the GIS to provide guidance
when editing topology of features after raster-to-vector conversion (feature
snapping).
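The first method above, filtering the raster before conversion, might look like the following sketch. It relabels any cell that has no 4-neighbour of the same class with the most common neighbouring class. The function name `remove_isolated_pixels` and the rule itself are assumptions; production GIS filters typically offer several window sizes and tie-breaking options.

```python
from collections import Counter

def remove_isolated_pixels(grid):
    """Relabel cells that have no 4-neighbour of the same class."""
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]  # read from grid, write to a copy
    for r in range(rows):
        for c in range(cols):
            neighbours = [grid[i][j]
                          for i, j in ((r - 1, c), (r + 1, c),
                                       (r, c - 1), (r, c + 1))
                          if 0 <= i < rows and 0 <= j < cols]
            if neighbours and grid[r][c] not in neighbours:
                # Adopt the most common neighbouring class.
                result[r][c] = Counter(neighbours).most_common(1)[0][0]
    return result

grid = [
    [1, 1, 1],
    [1, 2, 1],
    [1, 1, 1],
]
print(remove_isolated_pixels(grid))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

After filtering, the lone class-2 pixel no longer produces a single-pixel polygon during conversion, at the cost of removing what may have been a real small feature.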