PostGIS is an extension to the PostgreSQL relational database that provides spatial types,

indexes and functions, following the OGC “Simple Features for SQL” (SFSQL).

Starting the Suite

You can start and stop the OpenGeo Suite, and access components like PostGIS and

GeoServer, via the “Dashboard”.

Start the Dashboard from the Start Menu > OpenGeo (Windows) or Applications >

OpenGeo (OS X).

When you first start the dashboard, it provides a reminder about the default password for

accessing GeoServer.


The PostGIS database has been installed with unrestricted access for local users (users

connecting from the same machine on which the database is running). That means that it will

accept any password you provide. If you need to connect from a remote computer, the

password for the postgres user has been set to postgres.

First, we need to start up the Suite (which will start both PostGIS and GeoServer). Click

the green Start button at the top right corner of the Dashboard.


The first time the Suite starts, it initializes a data area and sets up template databases.

This can take a couple of minutes. Once the Suite has started, you can click the Manage

option under the PostGIS component to start the pgAdmin utility.



PostgreSQL has a number of administrative front-ends. The primary one is psql, a

command-line tool for entering SQL queries. Another popular PostgreSQL front-end is

the free and open source graphical tool pgAdmin. All queries done in pgAdmin can also

be done on the command line with psql.

If this is the first time you have run pgAdmin, you should have a server entry for PostGIS

(localhost:54321) already configured in pgAdmin. Double click the entry, and enter

anything you like at the password prompt to connect to the database.


If you have a previous installation of pgAdmin on your computer, you will not have an

entry for (localhost:54321). You will need to create a new connection. Go to File >

Add Server, and register a new server at localhost and port 54321 (note the

non-standard port number) in order to connect to the PostGIS bundled with the

OpenGeo Suite.


Creating a Database

PostgreSQL has the notion of a template database that can be used to initialize a new

database – the new database automatically gets a copy of everything from the template. When

you installed PostGIS, a spatially enabled database called template_postgis was created.

If we use template_postgis as a template when creating our new database, the new

database will be spatially enabled.

Open the Databases tree item and have a look at the available databases. The

postgres database is the user database for the default postgres user and is not too

interesting to us. The template_postgis database is what we are going to use to

create spatial databases.


Right-click on the Databases item and select New Database.

If you receive an error indicating that the source database (template_postgis) is

being accessed by other users, this is likely because you still have it selected.

Right-click on the PostGIS (localhost:54321) item and select Disconnect.

You can then double-click the same item to reconnect and try again.

Fill in the New Database form as shown below and click OK.

Name postgis

Owner postgres

Encoding UTF8

Template template_postgis
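
The form above corresponds to a single SQL statement. As a minimal sketch, assuming you are connected as the postgres user to some database other than template_postgis, the same result could be obtained from a query window or from psql:

```sql
-- Create a spatially enabled database by copying the PostGIS template.
-- The options mirror the New Database form: name, owner, encoding, template.
CREATE DATABASE postgis
    WITH OWNER = postgres
         ENCODING = 'UTF8'
         TEMPLATE = template_postgis;
```

As with the form, the statement fails if another session is still connected to template_postgis.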


Select the new postgis database and open it up to display the tree of objects. You’ll

see the public schema, and under that a couple of PostGIS-specific metadata tables –

geometry_columns and spatial_ref_sys – which we will discuss later.


Click on the SQL query button indicated below (or go to Tools > Query Tool).

Enter the following query into the query text field:

SELECT postgis_full_version();


This is our first SQL query. postgis_full_version() is a management function

that returns version and build configuration.


Click the Play button in the toolbar (or press F5) to “Execute the query”. The query will

return a version string, confirming that PostGIS is properly enabled in the database.


You have successfully created a PostGIS spatial database! Now do a spatial calculation

just to make sure. Copy the following into the SQL window:

SELECT ST_Length('LINESTRING(0 0, 1 1)');

Our first spatial query constructs a diagonal line across a one-unit square. The length of

that line is sqrt(2), or 1.4142.
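
The same check can be written with the geometry built explicitly, which avoids any ambiguity about how the text literal is interpreted. The sketch below is illustrative only: the column aliases are made up, and the second line (a 3-4-5 triangle) is an extra example that is not part of the workshop data.

```sql
-- Build the geometries explicitly from well-known text, then measure them.
SELECT
    ST_Length(ST_GeomFromText('LINESTRING(0 0, 1 1)')) AS diagonal,   -- about 1.4142
    sqrt(2)                                            AS expected,   -- same value, computed directly
    ST_Length(ST_GeomFromText('LINESTRING(0 0, 3 4)')) AS hypotenuse; -- 5
```

If the first two columns agree, the geometry functions are behaving as expected.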


Loading Shapes into PostGIS

The workshop data files are public domain data from the City of Medford, Oregon. The files are

located in the data/ directory of the workshop. The projection of the data is NAD83 State Plane

(Oregon South) in feet, more succinctly and opaquely known as EPSG:2270. The files are:

school_pt.shp a small point file of school locations

road_ln.shp a large line file of street centerlines

taxlot_ply.shp a large polygon file of taxable property parcels

We will load our example data into PostGIS using the pgShapeLoader tool to convert from

Shape files to PostGIS tables.

From the pgAdmin Plugins menu, select PostGIS Shapefile and DBF loader.

The loader will start with the connection information for your current pgAdmin database.

Click the “Test connection...” button to ensure you can connect to the database.

Now, click on the button in the “Shape File” area, and browse to the data directory.

Select the “school_pt.shp” file, and click “Open”.


Next, change the value of the SRID field to 2270.

Finally, click the “Import” button to start the process.

Repeat the process for “road_ln.shp” and “taxlot_ply.shp”. These are much larger files. To

make the load process go faster, open the “Options...” dialogue and turn on the “Load using

COPY rather than INSERT” option before running the import.


Loading Shapes into PostGIS... Using the Command Line

PostGIS ships with a command-line utility for loading shape files into the database, called

shp2pgsql, as well as a utility for exporting tables to shape files, called pgsql2shp.

If you completed the process with PostGIS Shapefile and DBF loader above, you do not

need to run these commands – the data is already loaded into your database.

Enter the workshop data directory, set the PATH environment variable to include the

PostgreSQL executables directory, and then run the data loading commands. shp2pgsql

converts the shape file into a SQL text file suitable for loading into the database. psql loads

the text file into the target database.

# set PATH=%PATH%;C:\Program Files\OpenGeo\OpenGeo Suite\pgsql\8.4\bin

# shp2pgsql -I -s 2270 -D road_ln.shp road_ln > road_ln.sql

# psql -p 54321 -f road_ln.sql -d postgis

# shp2pgsql -I -s 2270 -D taxlot_ply.shp taxlot_ply > taxlot_ply.sql

# psql -p 54321 -f taxlot_ply.sql -d postgis

# shp2pgsql -I -s 2270 -D school_pt.shp school_pt > school_pt.sql

# psql -p 54321 -f school_pt.sql -d postgis
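
Whichever loader you used, a quick row count is a reasonable sanity check that all three tables arrived. This is only a sketch; the table_name and row_count aliases are illustrative, and the table names assume the defaults used in this workshop.

```sql
-- Count the rows loaded into each of the three workshop tables.
SELECT 'school_pt' AS table_name, Count(*) AS row_count FROM school_pt
UNION ALL
SELECT 'road_ln', Count(*) FROM road_ln
UNION ALL
SELECT 'taxlot_ply', Count(*) FROM taxlot_ply;
```

Each table should report a non-zero count; an error such as "relation does not exist" means that file was not loaded.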

PostGIS System Tables

PostGIS follows the OGC SFSQL (Simple Features for SQL) specification, which means it

includes two standard system tables of metadata: SPATIAL_REF_SYS and GEOMETRY_COLUMNS.



The SPATIAL_REF_SYS table contains information about “spatial reference systems” –

combinations of geographic systems (ellipsoids, datums) and projected systems (projections,

parameters) that are used for real-world mapping. “Transverse Mercator” is an example of a

projection, and WGS84 is an example of a spheroid, but “UTM Zone 10 North, NAD 83” is an

example of a full spatial reference system.

            Table "public.spatial_ref_sys"
  Column   |          Type           | Modifiers
-----------+-------------------------+-----------
 srid      | integer                 | not null
 auth_name | character varying(256)  |
 auth_srid | integer                 |
 srtext    | character varying(2048) |
 proj4text | character varying(2048) |
Indexes:
    "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)

Each row in the SPATIAL_REF_SYS table corresponds to one spatial reference system. The

srid column is the unique identifier, and is considered “internal” to the database. The

auth_name and auth_srid are the external authority and authority number. The authority is

usually “EPSG”, and the table that ships with PostGIS matches each srid to its auth_srid.


The srtext is the OGC “well-known text” representation of the spatial reference system. The

proj4text is the representation consumed by the Proj.4 reprojection library PostGIS uses to

provide on-the-fly reprojection. Because only the proj4text is used internally by PostGIS, it

is usually safe to omit the srtext when adding new entries, but be aware that external

programs may use the srtext to determine the projection of a particular table.
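
To see what one of these rows looks like, we can look up the reference system used by the workshop data. A minimal sketch, assuming the stock spatial_ref_sys table that ships with PostGIS is in place:

```sql
-- Inspect the spatial reference system entry for the workshop data (SRID 2270).
SELECT srid, auth_name, auth_srid, proj4text
FROM spatial_ref_sys
WHERE srid = 2270;
```

Adding srtext to the column list shows the well-known text form of the same definition.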


The GEOMETRY_COLUMNS table contains information about the spatial columns in a database.

           Table "public.geometry_columns"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 f_table_catalog   | character varying(256) | not null
 f_table_schema    | character varying(256) | not null
 f_table_name      | character varying(256) | not null
 f_geometry_column | character varying(256) | not null
 coord_dimension   | integer                | not null
 srid              | integer                | not null
 type              | character varying(30)  | not null

Each row in the table corresponds to one spatial column. Tables may have multiple spatial

columns. Client software such as QGIS and uDig often uses the GEOMETRY_COLUMNS table to

figure out which columns to display to the end user as “layers” suitable for viewing on a map.

The first four columns (f_table_catalog, f_table_schema, f_table_name,

f_geometry_column) serve to uniquely locate the geometry column. The next three

describe the spatial metadata:

coord_dimension provides the dimensionality (2, 3, or 4 dimensions are supported in PostGIS).

srid provides the spatial reference system and must refer to a valid row in the SPATIAL_REF_SYS table.

type provides the geometry type (point, linestring, polygon, etc).

Note that the GEOMETRY_COLUMNS table is not automatically updated as you create and drop

tables. You must manually keep it up to date.
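
To check what is currently registered, query the table directly. This sketch uses only the columns described above; after loading the workshop data you would expect one row for each of the three tables.

```sql
-- List every registered spatial column with its metadata.
SELECT f_table_schema, f_table_name, f_geometry_column,
       coord_dimension, srid, type
FROM geometry_columns
ORDER BY f_table_schema, f_table_name;
```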

One way to keep the table up-to-date is to religiously use the AddGeometryColumn()

function when managing DDL in spatial tables. This function takes in all the information

necessary to create a new column, performs the creation, and adds a metadata record:

SELECT AddGeometryColumn(<schema_name>, <table_name>, <column_name>, <srid>, <type>, <dimension>);
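
As a concrete illustration, the sketch below creates a small attribute-only table and then adds a registered point column to it. AddGeometryColumn() takes the schema, table, column name, SRID, geometry type and dimension, in that order. The survey_points table and its label column are hypothetical names invented for this example; they are not part of the workshop data.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE survey_points (
    gid   serial PRIMARY KEY,
    label varchar(64)
);

-- Add a 2D point column in EPSG:2270 and register it in geometry_columns.
SELECT AddGeometryColumn('public', 'survey_points', 'the_geom', 2270, 'POINT', 2);

-- Insert a row; the SRID of the geometry must match the column's SRID.
INSERT INTO survey_points (label, the_geom)
VALUES ('example', ST_GeomFromText('POINT(4387009 402407)', 2270));
```

Afterwards, geometry_columns should contain a new row for survey_points.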
Another way to keep the table up-to-date is to use helper functions. PostGIS 1.4 and higher

provide the Populate_Geometry_Columns() function, which checks for validity and also

fills in missing entries.

-- PostGIS 1.4

SELECT Populate_Geometry_Columns();



probed:3 inserted:3 conflicts:0 deleted:0

(1 row)

Spatial Queries

We will now construct some queries of our spatial database, using “spatial SQL” functions

provided by PostGIS (and any other SFSQL spatial database). For a reference list of functions

we will be using, see the PostGIS Functions section.


The taxlot_ply table contains 91,343 parcel polygons. It also includes a large number of

attributes about each parcel, including:

impvalue (improvement value)

landvalue (land value)

acreage (reported acreage)

yearblt (year built)

feeowner (name of the owner)

state (state of residence of the owner)

We can use the ST_Area() function in combination with these attributes to ask some

questions of the taxlot_ply table. Open the pgAdmin SQL window and enter the following

queries.

What is the area in acres of all parcels in the database?

SELECT Sum(ST_Area(the_geom)) / 43560

FROM taxlot_ply;

Answer: 1772888

What is the area in acres of parcels built on since 2000?

SELECT Sum(ST_Area(the_geom)) / 43560

FROM taxlot_ply

WHERE yearblt >= 2000;

Answer: 27176

What is the value per square foot of all parcels?

SELECT Sum(landvalue + impvalue) / Sum(ST_Area(the_geom))

FROM taxlot_ply;

Answer: 0.41

What is the value per square foot of all parcels held by out-of-state owners?

SELECT Sum(landvalue + impvalue) / Sum(ST_Area(the_geom))

FROM taxlot_ply

WHERE state != 'OR';

Answer: 0.38
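
The divisor 43560 in the queries above is the number of square feet in an acre; because the data is in a foot-based projection, ST_Area() returns square feet. As a sketch, the conversion can be checked parcel by parcel against the reported acreage. This assumes the loader created its usual gid key column on taxlot_ply, and the two column aliases are illustrative.

```sql
-- Compare the reported acreage with the acreage computed from the geometry.
SELECT gid,
       acreage                   AS reported_acres,
       ST_Area(the_geom) / 43560 AS computed_acres   -- square feet to acres
FROM taxlot_ply
ORDER BY gid
LIMIT 10;
```

Small differences between the two columns are to be expected, since reported acreage usually comes from survey records rather than from the digitized polygon.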

Measurement is not limited to areas. We can also use linear measurements to characterize the

roads in the county.

What is the breakdown of road types in the county?

SELECT
  Sum(ST_Length(the_geom)) / 5280 as miles,
  Count(*) as nsegments,
  cfcc
FROM road_ln
GROUP BY cfcc
ORDER BY cfcc;


So far, our queries have calculated one metric or a summary against every record in the

database. Databases are commonly used to store very large tables – larger than can be stored

in memory – and efficiently access sub-sets of those tables.

First, let’s find out the coordinates of the first school in our school_pt table:

SELECT ST_AsText(the_geom) FROM school_pt WHERE gid = 1;

Answer: POINT(4387009 402407)

Now, let’s take that point, and find the average property value in a one-mile (5280 foot) radius.

SELECT Sum(landvalue + impvalue) / Count(*) as avg_value
FROM taxlot_ply
WHERE ST_DWithin(
  the_geom,
  ST_GeomFromText('POINT(4387009 402407)', 2270),
  5280
);

Answer: 161,094

There are a number of things going on in this query:

The ST_GeomFromText() function is used to build a geometry object from the text

representation of a point. Note that the SRID is also set to 2270 at the same time, to

match the SRID of our data tables.

The ST_DWithin() function is then used to test every geometry against the query

point, and return true only if the geometry was within 5280 units (feet).

Finally, only those records that passed the distance test were fed into the calculation of

the average property value: total value divided by number of properties.
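
To see which records pass the distance test, it can help to list a few of them with their actual distances. The sketch below is illustrative: the subquery alias q, the pt column and the output aliases are made up, and the gid column is assumed to be the key created by the loader.

```sql
-- Show the five qualifying parcels that lie farthest from the school.
SELECT t.gid,
       t.landvalue + t.impvalue       AS total_value,
       ST_Distance(t.the_geom, q.pt)  AS feet_from_school
FROM taxlot_ply t,
     (SELECT ST_GeomFromText('POINT(4387009 402407)', 2270) AS pt) q
WHERE ST_DWithin(t.the_geom, q.pt, 5280)
ORDER BY feet_from_school DESC
LIMIT 5;
```

Every distance returned should be at most 5280, since only parcels within that radius pass the ST_DWithin() test.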

Spatial Indexes

The PostGIS spatial index is an r-tree index, implemented on top of PostgreSQL’s GiST access

method infrastructure.

An “r-tree” (and any other spatial index) works by sorting the bounding boxes of features into a

quickly searchable tree. Because the features themselves are not indexed, just the bounding

boxes, all queries that use spatial indexes must proceed in two phases. First, the spatial index

is used to generate a subset of records that might match a spatial condition; then, an exact test

is used on just that subset to produce the final output set.

The “r-tree” index uses nested rectangles (in the two-dimensional case; cubes and hypercubes

in higher dimensions) to sort the features into a quickly searchable tree.
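
The two phases can be written out by hand. In this sketch, the && operator compares bounding boxes only (the step an index can accelerate), and ST_Distance() then applies the exact test to the survivors. The one-mile search around the school point is used as the example; the alias names are illustrative.

```sql
-- Phase 1: keep parcels whose bounding box overlaps a box grown 5280 feet
--          around the query point (cheap, bounding boxes only).
-- Phase 2: of those, keep parcels whose true distance is within 5280 feet.
SELECT Count(*) AS parcels_within_a_mile
FROM taxlot_ply t,
     (SELECT ST_GeomFromText('POINT(4387009 402407)', 2270) AS pt) q
WHERE t.the_geom && ST_Expand(q.pt, 5280)
  AND ST_Distance(t.the_geom, q.pt) <= 5280;
```

Functions such as ST_DWithin() bundle both steps, so in practice you rarely need to spell out the bounding-box filter yourself.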

To create a spatial index in PostGIS, use the CREATE INDEX [indexname] ON

[tablename] USING GIST ( [geometry] ) command. The commands for indexing our three

example tables appear in the comparison below.

Let’s compare an unindexed and indexed query for speed.

First, drop the spatial indexes on your tables.

DROP INDEX school_pt_the_geom_gist;

DROP INDEX taxlot_ply_the_geom_gist;

DROP INDEX road_ln_the_geom_gist;


Run the average property query, and see how fast it executes:

SELECT Sum(landvalue + impvalue) / Count(*) as avg_value
FROM taxlot_ply
WHERE ST_DWithin(
  the_geom,
  ST_GeomFromText('POINT(4387009 402407)', 2270),
  5280
);

Now, add the spatial indexes back onto your tables, and run the query again.

CREATE INDEX school_pt_the_geom_gist ON school_pt USING GIST (the_geom);

CREATE INDEX taxlot_ply_the_geom_gist ON taxlot_ply USING GIST (the_geom);

CREATE INDEX road_ln_the_geom_gist ON road_ln USING GIST (the_geom);


The unindexed query logs an execution time of over 1000ms, while with the indexes, a time of

less than 50ms is achieved.
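
Rather than relying on the timing alone, you can ask PostgreSQL how it ran the query. A minimal sketch using EXPLAIN ANALYZE on the one-mile query; the exact plan text varies between PostgreSQL and PostGIS versions, so treat the output as indicative.

```sql
-- Show the execution plan and the measured run time for the radius query.
EXPLAIN ANALYZE
SELECT Sum(landvalue + impvalue) / Count(*) AS avg_value
FROM taxlot_ply
WHERE ST_DWithin(
    the_geom,
    ST_GeomFromText('POINT(4387009 402407)', 2270),
    5280
);
```

With the indexes dropped, the plan should show a sequential scan of taxlot_ply; with them in place, it should mention an index scan on taxlot_ply_the_geom_gist.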

Spatial Joins

With spatial indexes in place, we can perform spatial joins quickly – taking information from two

distinct tables and joining it together on the basis of spatial relationships.

Our last query determined the average property value within a one-mile radius of a single

school. We can use a spatial join to determine the property value within a one-mile radius

for all schools. Or, to keep the result set smaller, just the high schools.

SELECT s.name AS school_name,  -- s.name is an assumed column; the name is missing in the source
  Sum(t.landvalue + t.impvalue) / Count(*) AS avg_property_value
FROM taxlot_ply t, school_pt s
WHERE
  ST_DWithin(t.the_geom, s.the_geom, 5280)
AND
  s.type = 'High School'
GROUP BY s.name
ORDER BY avg_property_value DESC;

And now we know where to send our kids to school in Medford.
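
A spatial join can also be written with explicit JOIN ... ON syntax, with the spatial predicate as the join condition. The sketch below counts the parcels within a mile of each high school. It groups on gid, which is assumed to be the key column the loader created on school_pt, and the output aliases are illustrative.

```sql
-- Count the parcels within one mile (5280 feet) of each high school.
SELECT s.gid    AS school_gid,
       Count(*) AS parcels_within_a_mile
FROM school_pt s
JOIN taxlot_ply t
  ON ST_DWithin(t.the_geom, s.the_geom, 5280)
WHERE s.type = 'High School'
GROUP BY s.gid
ORDER BY parcels_within_a_mile DESC;
```

Putting the spatial test in the ON clause makes it clear which condition relates the two tables and which merely filters the schools.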


These are just a few examples of using spatial SQL for querying a database. In the

remaining sections of the workshop, most of the querying will happen behind the scenes, as

tools like GeoServer pull data from the database.

However, the power of the spatial database for analysis and querying remains easily available

via scripting languages and direct user tools like pgAdmin to quickly analyze or automate

geospatial tasks.

