Part 1 - An Overview of GIS

The goal of part 1 is to provide you with a basic foundation in GIS concepts and software in preparation for the rest of the tutorial.

I. Basic GIS Concepts
II. GIS Software
III. Open Source

Section I: Basic GIS Concepts

A geographic information system (GIS) is an integrated collection of software and data used to visualize and organize geographic data, conduct spatial analysis, and create maps and other geospatial information. Narrow definitions of GIS focus on the software and data, while broader definitions include hardware (where the data and software are stored), metadata (data that describes the data), and the people who are part of the system and interact with it as creators, curators, and users.

Another definition: GIS is a visual system that organizes information around the concepts of place and location that can be used for geographic analysis, map making, database management, and geospatial statistics. GIS can be (and has been) applied to virtually any discipline or endeavor.

In a GIS, geographic features are represented as individual files or layers that can be added to a map. These features are not maps in and of themselves, but are the raw materials used for map making and analysis. For much of the 20th century cartographers drew geographic features on individual mylar or acetate sheets and then layered those sheets over a paper base map to create maps. GIS uses the same principle of layering, with individual files consisting of features that can be layered on top of each other in GIS software. GIS software acts as an interface, or window, for viewing and manipulating GIS data. The ability to add different layers is quite powerful, as combining the layers allows for analysis that would be impossible if you were viewing single layers by themselves (see the example below of an air photo, flood zones, and hazardous sites layered together):

ArcGIS with several layers

Each GIS file is georeferenced, meaning that the file is tied to real locations on the earth. Just as paper maps were drawn based on map projections and coordinate systems, each GIS file has also been created based on a particular projection and coordinate system, which means that files that share the same reference system can be laid on top of each other. Since projections and coordinate systems are highly standardized, GIS data can easily be shared. If two files do not share the same system, most GIS software can convert files from one system to another so they'll match. This is what distinguishes map making in a GIS from map making in a graphic design package. Maps created in a graphic design package are just simple lines and shapes with no connection to the earth, and the components of the map can't be easily reused to make other maps. GIS files used to create maps in a GIS package can readily be shared and used to create any map, because they are tied to the earth using standardized systems.

Georeferenced Hawaii feature

GIS files are stored in several formats, and each format comes in several different file types. Major formats and file types include:

Raster: satellite imagery; scanned paper map (USGS topo); land use & land cover raster
Vector: point shapefile; line shapefile; polygon shapefile
Tabular: spreadsheet with geographic data

Raster and vector GIS files exist spatially, in that you can see the grid or shapes and their corresponding location on the earth, but they also exist in tabular form. This is particularly valuable in the case of vector files. For example, a vector file showing country boundaries has an attribute table attached to it that has one record for each country. This attribute table contains columns or fields that store values for each country, such as the country's name, values like population or area that describe it, and ID codes that uniquely identify each one. The names can be used by the GIS to label each country, and values like population can be thematically mapped.

A shapefile with attribute table

The ID codes for each country can be used to join the attribute table for the GIS file to a tabular file that contains country-level data. For example, a GIS file of country boundaries with a country code can be joined within the GIS, using relational database techniques, to a text or spreadsheet file that has country-level data and that uses the same country codes to identify each country. The data in the table, which was just a regular table with no geospatial geometry, can now be visualized and mapped in GIS. There are a number of standard ID codes that can be used for joining data. The two most common families of codes are FIPS (created by the US government to identify every single geographic entity in the US; there are also FIPS codes for countries) and ISO (created by the International Organization for Standardization to identify countries and their subdivisions).

Joining features based on common identifiers
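
The join described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular GIS package implements it: the layer, the column names (iso_a3, name, value), and the numbers are all made up for the example, and the geometry that a real attribute table would be attached to is left out.

```python
# Attribute table of a hypothetical country-boundary layer (geometry omitted).
attribute_table = [
    {"iso_a3": "CAN", "name": "Canada"},
    {"iso_a3": "MEX", "name": "Mexico"},
    {"iso_a3": "USA", "name": "United States"},
]

# A plain table with no geometry, keyed by the same ISO country codes.
# The values are placeholders, not real statistics.
country_data = [
    {"iso_a3": "CAN", "value": 10},
    {"iso_a3": "USA", "value": 30},
]


def join_tables(features, table, key):
    """Left join: keep every feature and attach matching table columns."""
    lookup = {row[key]: row for row in table}
    joined = []
    for feature in features:
        match = lookup.get(feature[key], {})
        merged = dict(feature)  # copy, so the original table is unchanged
        for column, value in match.items():
            if column != key:
                merged[column] = value
        joined.append(merged)
    return joined


joined = join_tables(attribute_table, country_data, "iso_a3")
for record in joined:
    # Features without a match in the data table print None for the value.
    print(record["iso_a3"], record["name"], record.get("value"))
```

Every country in the layer is kept, whether or not the data table has a matching code. That mirrors what you usually want in a GIS: the features stay on the map, and those with no match simply have no value to display.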

Section II: GIS Software

A standard interface for GIS software has evolved over time. Typically, GIS software has a data view that consists of a table of contents that lists files that have been added to a project, a data window that displays the GIS files, and a set of toolbars and menus for accessing various tools and launching various processes. Dragging the layers in the table of contents changes the drawing order of the layers, and right or left clicking on a layer in the table of contents will reveal the properties of that particular layer. You can also access the layer's attribute table and a symbol tab for changing how its features are depicted or classified. There are several tools for zooming in and out to examine different layers and to change the extent of the view.

QGIS

The way that coordinate systems and projections are handled is different for individual GIS software packages. In general, the options are: define the projection and coordinate system for the project before adding the files, or the project automatically takes the projection of the first file added. If you try to add GIS files that have different projections, some software may try to re-project the data on the fly, while others will simply fail to draw the new layers. Even if the software can correctly draw a layer without the user defining it, or even if it can re-project layers on the fly, users will run into problems later on when trying to manipulate the GIS files. You should always be sure to define the projection properly and make sure that all files share the same one - most GIS software will give you the ability to re-project data.
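
To make the idea of re-projection more concrete, here is a minimal sketch of one conversion: from longitude and latitude in degrees (WGS 84, EPSG:4326) to Web Mercator metres (EPSG:3857), the system used by many web maps. GIS software relies on dedicated projection libraries that support thousands of systems; the function name below is invented for this example, and the formula is the standard spherical Mercator one.

```python
import math

# WGS 84 semi-major axis in metres, used as the sphere radius by Web Mercator.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS 84 longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Approximate location of Honolulu, Hawaii.
x, y = lonlat_to_web_mercator(-157.8583, 21.3069)
print(round(x), round(y))
```

The same place has very different coordinate values in the two systems, which is why layers stored in different systems will not line up until one of them is re-projected.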

GIS software provides users with a variety of ways to query geographic data: by selecting records in the attribute table or shapes in the view, or by building queries that highlight features that contain specific attributes or that have some relationship with another geographic layer.
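
The two kinds of query can be sketched as follows, assuming a made-up layer of point sites and a single square flood zone polygon. The attribute query filters on a column value, and the spatial query uses a basic ray casting test to decide whether a point falls inside the polygon. Real GIS software handles many more cases (holes, multi-part shapes, points exactly on a boundary) that this sketch ignores.

```python
# Hypothetical point features: attributes plus an (x, y) location.
sites = [
    {"id": 1, "type": "hazardous", "xy": (2.0, 2.0)},
    {"id": 2, "type": "hazardous", "xy": (8.0, 1.0)},
    {"id": 3, "type": "school", "xy": (3.0, 3.0)},
]

# Hypothetical flood zone polygon, stored as a ring of vertices.
flood_zone = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 5.0)]


def point_in_polygon(point, ring):
    """Ray casting: count polygon edges crossed by a ray going right."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[i - 1]  # previous vertex; wraps around at i == 0
        if (y1 > y) != (y2 > y):  # edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Attribute query: select features by a value in the attribute table.
hazardous = [s for s in sites if s["type"] == "hazardous"]

# Spatial query: select features by their relationship to another layer.
in_flood_zone = [s for s in sites if point_in_polygon(s["xy"], flood_zone)]

# Combined: hazardous sites that fall inside the flood zone.
hazardous_in_flood_zone = [
    s for s in hazardous if point_in_polygon(s["xy"], flood_zone)
]
print([s["id"] for s in hazardous_in_flood_zone])
```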

GIS software comes with a variety of editing tools that allow you to modify the geometry of GIS files. For example, you can merge features together, break them apart, or clip out or select certain areas to create new files. Collectively these processes are known as geoprocessing. You geoprocess layers in order to prepare raw data for analysis, to create new layers or data, or to simplify layers for cartographic or aesthetic purposes. GIS also provides the ability to edit files on a feature by feature basis.

Most GIS programs have a separate map layout or print layout, where the user can create finished maps with standard map elements like titles, legends, scale bars, north arrows, and accompanying text. Finished maps can be exported out of the GIS as static files, such as PDFs or JPEGs.

Users can always save their GIS projects in a GIS project file. The scale and extent of the data view, symbolization and classification assigned to layers, map layouts, and links to GIS files used in the project are stored in the file. It's important to understand that the GIS files themselves are NOT stored inside the project file - the GIS data and the GIS project file exist independently. When adding data to a GIS, you are establishing a link from the GIS project to the GIS data - the GIS data is not stored within the project. Furthermore, changing the colors of the features or classifying them in a certain way has NO EFFECT on the actual GIS data files themselves. When you change symbols, you are only changing how the GIS program views the data - you're not changing the data itself.

This is an important concept to grasp. Essentially, the GIS software acts as a window for viewing and working with GIS data, which is stored outside the window. The GIS project file stores the window dressing: scale, symbolization, and the like. You never actually change the GIS data unless you go into an edit mode or conduct an operation that creates a new GIS file. This relationship is of crucial importance when it comes time to move or share files - if you move your project file or your data, the links between them will be broken, and you'll need to re-establish the links between the project and the data in order to repair your project file.
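
The relationship between a project file and its data can be illustrated with a toy example. The JSON layout, the file names, and the broken_links function below are all invented for this sketch and do not reflect the actual project format of QGIS or any other package. The project stores only a path to the data and a color; restyling rewrites the project file alone, and moving the data breaks the link.

```python
import json
import tempfile
from pathlib import Path


def broken_links(project_path):
    """Return the layer paths in a project file that are missing on disk."""
    project = json.loads(Path(project_path).read_text())
    base = Path(project_path).parent
    return [
        layer["path"]
        for layer in project["layers"]
        if not (base / layer["path"]).exists()
    ]


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)

    # The data file lives on its own, outside the project file.
    data_file = workdir / "countries.csv"
    data_file.write_text("iso_a3,name\nCAN,Canada\n")
    original_data = data_file.read_text()

    # The project stores a link to the data plus styling, not the data.
    project_file = workdir / "project.json"
    project = {"layers": [{"path": "countries.csv", "color": "green"}]}
    project_file.write_text(json.dumps(project))

    # Changing the symbol color only rewrites the project file.
    project["layers"][0]["color"] = "blue"
    project_file.write_text(json.dumps(project))
    data_unchanged = data_file.read_text() == original_data

    links_before_move = broken_links(project_file)

    # Moving the data file breaks the link stored in the project.
    (workdir / "archive").mkdir()
    data_file.rename(workdir / "archive" / "countries.csv")
    links_after_move = broken_links(project_file)

print(data_unchanged, links_before_move, links_after_move)
```

Repairing the project in this sketch would mean updating the stored path to point at the new location, which is essentially what GIS software asks you to do when it cannot find a layer.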


Section III: Open Source

In this tutorial we will be using QGIS, which is free and open source software (FOSS). Open source software is an alternative to proprietary software.

Open source software can be created in several ways. In one case, a programmer or developer creates software from scratch because they have some need that isn't being met by current software. Over time, as other programmers discover the project, they may choose to contribute to building or improving the software, and they rally around the creator and begin to form a group that becomes devoted to the project. The Linux operating system and the Perl programming language essentially began this way. Alternatively, a group of people who receive support from a business or entrepreneurs take software that was formerly proprietary but is no longer commercially viable, build on this product, and re-release it as open source. The Mozilla Firefox browser (formerly the proprietary Netscape) and OpenOffice (formerly the proprietary StarOffice) are examples of the latter.

Why would people want to bother with creating FOSS software?

The number of FOSS GIS packages has grown over the course of the last decade. In this tutorial we will be using Quantum GIS (QGIS), which was initially developed by a group of volunteers in 2002 as a simple GIS viewer but has evolved into one of the premier FOSS GIS packages.

The advantages of using QGIS for this tutorial are that it's free, you can download it yourself if you have your own computer, it runs on all major operating systems, it is mature enough that it supports most essential GIS tasks plus a few intermediate and advanced ones, and it's relatively easy to use.

The disadvantages are that QGIS can't do everything that proprietary software can, is still working out some bugs, and doesn't have the name recognition that software like ArcGIS or MapInfo does. There also isn't as much in the way of documentation or tutorials for QGIS relative to the other options, but this is changing.

Open source software tends to be modular rather than monolithic; you often have several independent software applications to perform different functions, rather than one large piece of software that does it all. A typical FOSS GIS workstation may include several applications like QGIS (for viewing data, basic analyses, map making, and generally working with vector data), GRASS (a more advanced GIS for doing analyses and modeling and for working with raster data), GDAL / OGR (command line tools for converting files and projections and for basic queries), and a geodatabase application (PostGIS for server-based databases and SpatiaLite / SQLite for desktop use).

ArcGIS, created by a company called ESRI, has been on the market for several decades and is the dominant proprietary (non-FOSS) GIS software. It's used by most government agencies and universities. Since it is rather expensive to purchase for individual use, you tend to see it more often in institutional settings. If you are affiliated with a college or university, chances are you'll be able to access it somewhere on your campus. ESRI does distribute trial versions of the software for education and home use. A rival product, MapInfo from Pitney Bowes, has a smaller but equally dedicated following. If you find that you need to learn one of these products, making the transition from FOSS is relatively straightforward, as most GIS software operates under the same principles and shares a similar user interface.
