The introduction of SQLCLR in SQL Server 2005 allowed for very rich User Defined Types to be utilized. This meant that a developer could create a single object that contained multiple data points (properties) and could also perform calculations internally (methods), yet store that instance in a single field of a single row in a database table. Suddenly, any type of data could be stored and queried in the database, instead of just strings and numbers.
Fast forward to the present time, as Microsoft is celebrating the launch of SQL Server 2008 (notice that in the brilliance that is known as Marketing Speak, the term "Launch" does not mean "Release", so the product is still under development). We're seeing the realization of the SQLCLR UDT concept being branded as being "Beyond Relational". In this upcoming release, developers will have access to two new spatial data types that are included with the product: Geometry and Geography.
Think back to High School. No, I'm not talking about those painful memories of being on the receiving end of wedgies, or the awkwardness and loss of words that came over you when you were around members of the opposite gender. I'm referring to Mr. Bollenbacher's Geometry class, where you learned about all sorts of useless things that you would never again need to use...
High School Geometry class, for the most part, involved working in a single plane. A good part of the curriculum taught us mathematical proofs, and constructions using a compass and straight-edge. But, we also pulled out the graph paper at times, and plotted points using a Cartesian Coordinate System, then connected two points points to form a line, and then connected lines to create closed shapes, or polygons.
The Geometry spatial data type in SQL Server 2008 is used to represent information in a uniform 2-dimensional plane, much like the graph paper analogy from High School. The units are completely user-defined, and could be inches, miles, pixels, or even picas. The point is that SQL Server doesn't care; it can perform calculations just the same.
You would use the Geometry data type in conjunction with some existing planar projection of the real world that becomes the underlying reference. That is, you would usually start with an existing diagram or a map, and define your spatial data in terms of that base map.
For example, perhaps your user-defined coordinate space is being used to represent a warehouse facility. Within that coordinate space, you can use the Geometry data type to define polygons that represent storage bays within the warehouse. Data in your tables can track where inventory is placed using these polygons. Then, it becomes possible to do things with the data, like determine where the closest item of a certain type is to a forklift driver's present location.
The most important thing to keep in mind about the Geometry data type is that it works with uniform units of data. The number of units across the top of the plane is the same as the number of units across the bottom, and the same applies to the left and right sides.
Perhaps you have not yet heard, but the Earth is actually not flat. It's more like a ball. Take a minute to absorb that information... I'll wait.
Since the Earth is not flat, it is challenging to represent real-world locations using 2-dimensional mathematics. Sure, we're used to looking at flat maps of the world, whether it's that road map folded up in your car, or the rolled-up atlas that hung over the chalk board in school, or even a Google map displayed on your computer screen. But, those are merely projections of the real world onto a plane.
There are multiple ways to create such projections using mathematical transformations, with each type having benefits and disadvantages over other types of projections. For instance, one type of projection might attempt to minimize the distortion for a particular area of the globe, while other areas towards the outer edges of the projection become very distorted. Or, a projection might make it easier to represent points on the globe in a uniform fashion, but areas near the North and South poles (like Greenland and Antarctica) appear much larger in area than they actually are.
For this reason, representing geospatial information using the Geometry data type is not practical, especially if you wish to share information with other people who might be using a different projection. Fortunately, we are able to use a Geographic Coordinate System to accurately represent locations on Earth, or any ellipsoidal body for that matter.
The SQL Server Geography data type uses Latitude and Longitude angles to identify points on the Earth. Latitude measures how far North (or South) of the Equator a point is, while Longitude measures how far East (or West) of a Prime Meridian a point is. Note that this coordinate system can be used to identify points on any sized "ball", be it a golf ball, the Earth, or even the Sun.
Spatial Reference System
Latitude and Longitude are thought of as being absolute values that are universally accepted, but this is not the case. For instance, the Prime Meridians, or zero degrees Longitude, is quite arbitrary in nature. It is commonly referred to as being a North-South line that passes through Greenwich, London, England, but in truth, there are four different Prime Meridian that have been used historically that actually pass through Greenwich. Even the current Prime Meridian that is used by the GPS system (and most Internet mapping sites) is actually 102.5 meters to the East of the "official" line that marks the otherwise universally accepted Prime Meridian.
Once you know the exact starting points of the coordinate system that a set of data uses, we're left with another dilemma when we try to calculate the distance between two points. Because of the action of the Earth spinning on its axis, it is not actually spherical in shape, but rather ellipsoidal (it's fatter in the middle than at the poles). That complicates the ability to accurately calculate distance and area in a uniform manner, but doesn't make it impossible.
To measure distance between two points on the globe, we need to know the radius of the Earth (i.e., how far from the center of the Earth that each point is). The problem is that because of the land masses, the Earth doesn't have just one uniform radius that can be applied globally. If you measure the distance between two points that are one degree apart at sea level, you will get a different result than two points that are one degree apart in the mountains, which might be 5000 feet above sea level.
So, when a location is surveyed and points are identified in terms of global coordinates, these are based on a model of the earth that is quite accurate for that localized area, but maybe not as accurate elsewhere. The parameters that define the particular model of the Earth is known as the Spatial Reference System. It is important to know which Spatial Reference System is used for a set of data, because you cannot always mix data from one set with another and get accurate results.
With the pervasiveness of GPS data, I would argue that the most commonly used Spatial Reference System today is WGS84.
This first article was to establish a baseline of knowledge about some of the spatial concepts. Going forward from here, I will describe the standards that have been implemented in the creation of the Geometry and Geography data types, as well as how to actually use the data types programmatically.
(next part) SQL Server 2008: Spatial Data, Part 2
SQL Server 2008: Spatial Data, Part 3
SQL Server 2008: Spatial Data, Part 4
SQL Server 2008: Spatial Data, Part 5
SQL Server 2008: Spatial Data, Part 6
SQL Server 2008: Spatial Data, Part 7
SQL Server 2008: Spatial Data, Part 8