A View Inside My Head

Jason's Random Thoughts of Interest


SQL Server 2008: Spatial Data, Part 2

In the first part of this series, I provided a somewhat verbose description of the two different types of Spatial data supported by SQL Server 2008: Geometry (flat Earth model) and Geography (ellipsoidal model).  Now, let's dive in a bit deeper, and take a look at some of the specifics.


From their website, the Open Geospatial Consortium, Inc. (OGC) is described as "a non-profit, international, voluntary consensus standards organization that is leading the development of standards for geospatial and location based services."  Unlike me, these folks are the real experts in the field who understand both the problems and the solutions of working with spatially-aware systems.  By defining standards, they help to ensure that geospatial data can be shared between different vendors and platforms.

SQL Server 2008's Spatial data types are based on the OGC's "Simple Feature Access" standards, which are described in more detail than you would likely ever care to know by two documents in particular:

Note that like any standard, there is no guarantee that everything declared will be implemented - it just tries to guarantee that if something is implemented, that it is done in a way that is compatible with other implementations.  Likewise, there's nothing to stop a vendor from adding their own extensions to an implementation that provide functionality above and beyond what is defined in the standard.

Specific Types of Spatial Data

The Geometry and Geography data types in SQL Server 2008 are actually abstract by themselves.  When you instantiate an object of this type, it becomes one of a number of possible concrete types (with each having its own set of characteristics).  The following diagram (from the SQL Server BOL) displays the inheritance tree of the various concrete types (in blue) and their abstract ancestors (in yellow):

 Spatial Types - Click to Enlarge 

As far as SQL Server is concerned, though, we must define a variable or a column in a table as a Geometry (or Geography) type.  There is no means to be more specific at the time of declaration, and it is the job of the Geometry (or Geography) type itself to be aware of the deriving types like Point and LineString. 

This also means that a column in a table can have a mixture of different geospatial information: one row could contain a single point indicating where you are right now, another row could contain the surface of the state of Ohio, and a third row could contain a line that represents the route between your home and workplace.  We recognize these three things as unique types of geospatial information, but SQL Server thinks of them as the same things. 

Note: If you are not familiar with the concepts of inheritance and polymorphism from Object Oriented Programming, then consider this to be the same as how humans are able to generalize a dog and a cat both as being an animal.

POINT: A Point is an exact location, and is defined in terms of an X and Y pair of coordinates, as well as optionally by a Z (elevation) and M (measure) coordinate.  It does not have a length or any area associated with it.  Points are used as the fundamental building blocks of more complex spatial types.  Note: Z and M are recognized and maintained by SQL Server 2008 if supplied, but are not used in any calculations).

LINESTRING: A line segment is the shortest path between two points.  A LineString, then, is defined as the path between a sequence of points (i.e., a series of connected line segments).  It is considered simple if it does not cross over itself, and is considered a ring if the starting point is the same as the ending point.  A LineString is always considered to be a one dimensional object; it has length, but does not have area (even if it is a ring).

POLYGON: A Polygon is a closed two-dimensional shape defined by a ring.  It has both length and area.  A Polygon may also have holes in its interior (a hole is defined by another Polygon).  Area within a hole is considered to be exterior to the Polygon itself.

COLLECTIONS: In addition to the single instance types (Point, LineString, and Polygon), there is also a type that can hold a collection of instances.  This is similar to a list or an array in most programming languages.  The most generic type of collection is the GeomCollection, whose members can be of any type.  Deriving from GeomCollection are MultiPolygon, MultiLineString, and MultiPoint.  As their names suggest, the members of these collection types must be of the single instance type (i.e., MultiPoint can only contain Points, etc).

Representing Spatial Data

The OGC defines different ways to represent geospatial information as bytes of data that can then be interpreted by the Geometry or Geography types as being Points, Linestrings, etc.  SQL Server 2008 supports three such formats: Well-Known Text (WKT), Well-Known Binary (WKB), and Geography Markup Language (GML). 

For this series of articles, I will primarily be using WKT in my examples since is both concise and [relatively] human-readable.

The syntax of WKT is not difficult to understand, and can usually be inferred simply by looking at some examples:

POINT(10 10)
POINT(10 10 10 1) // X Y Z M shown here LINESTRING(0 0, 10 10) POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))

People with backgrounds in mathematics may be compelled to put commas between each coordinate, but in WKT, a space is used to separate the coordinates of a point, and a comma is used to separate points within a more complex type.  Parentheses are used to group points into a single shape. 

As a more complex example, here is how you would represent the above Polygon with a triangle-shaped hole in it:

POLYGON((0 0, 0 10, 10 10, 10 0, 0 0), (1 1, 4 5, 4 1, 1 1))

The first group of points defines the Polygon's ring, while the second set defines the hole within the Polygon's interior.

Though, as the old saying goes, a picture is worth a thousand words.  Therefore, it is often easier to plot out geospatial information for human consumption, so our Polygon would look something like this:

POLYGON((0 0, 0 10, 10 10, 10 0, 0 0), (1 1, 4 5, 4 1, 1 1))

(Note: In this particular visualization, the point (0 0) is considered to be the upper-left corner, and the units increase down and to the right)

SQL Server 2008: Spatial Data, Part 1

(next part) SQL Server 2008: Spatial Data, Part 3

SQL Server 2008: Spatial Data, Part 4

SQL Server 2008: Spatial Data, Part 5

SQL Server 2008: Spatial Data, Part 6

SQL Server 2008: Spatial Data, Part 7

SQL Server 2008: Spatial Data, Part 8

kick it on DotNetKicks.com

Pingbacks and trackbacks (1)+