README - NYMAG_2006_2008.mdb

Frank Donnelly - Geospatial Data Librarian, Baruch College CUNY
--------------------------------------------
Newman Library, Baruch College CUNY
francis.donnelly@baruch.cuny.edu
May, 2010
--------------------------------------------

The file that accompanies this document is a personal geodatabase created in ArcGIS 9.3. It contains state, county, place and PUMA level boundaries and PUMA level American Community Survey (ACS) census data for 2006-2008 for the New York-Northern New Jersey-Long Island, NY-NJ-PA Core Based Statistical Area. You can use this geodatabase with ArcGIS, versions 8.3 and later, to create your own maps and conduct your own analyses.

American Community Survey (2006-2008 estimates)

Data were originally downloaded from the American FactFinder at http://factfinder.census.gov/servlet/DatasetMainPageServlet?_program=ACS. The data were processed and edited, removing non-numeric characters from margin of error columns.

The boundary files were downloaded from the Census Bureau's TIGER site http://www.census.gov/geo/www/tiger/. Projections were defined as NAD83. Features were converted from single part to multipart features, and some features were edited to create land areas as opposed to strict statistical or legal boundaries.

If you do not have access to ArcGIS, you can still work with the data tables. Personal geodatabases are saved in Microsoft Access mdb format, and can be opened with Access. Tables can be exported out of Access into any format.

--------------------------------------------

Objects in the Geodatabase:

Data Tables:

There are 62 data tables in the NYMAG Geodatabase which represent a selection of tables from the 2006-2008 American Community Survey.
The tables were chosen to match data compiled in the NYC Planning Department's ACS Population Profiles, and represent some of the most common census variables. Each table contains data for the 156 Public Use Microdata Areas (PUMAs) in the New York-Northern New Jersey-Long Island, NY-NJ-PA Core Based Statistical Area, also known as the Greater New York City Metropolitan Area. In some instances, data for specific variables and PUMAs may be missing because the estimates were not statistically significant.

A data dictionary accompanies the geodatabase and contains a list of every table and field in the database. The table numbers and field designations were created by the US Census Bureau and have not been altered. The dictionary can be used to locate specific variables within the geodatabase. For convenience, a data summary (an abridged version of the dictionary) was created to help users quickly locate the most common variables.

Feature Classes:

Geographic features are provided at two scales: there is a set of features for the entire New York Core Based Statistical Area (these features begin with the word "metro") and a subset of features just for the five boroughs of New York City (these features begin with the word "nyc"). Users can choose the scale that is appropriate for their work.

At each scale, there are two versions of each feature. Features that contain the word "boundary" in their name depict the actual boundaries of the feature, which include both land and water. These boundaries are appropriate for reference purposes. Features that do NOT contain the word "boundary" in their names depict actual land areas with water removed. These features are appropriate for thematic mapping.

The PUMA features can be used to map the data in the database. The PUMA geographic features are uniquely identified by the "PUMA5ID00" field in their attribute tables.
This field can be joined in ArcGIS or a relational database to the "GEO_ID2" field in the data tables, as these fields contain the same ID number - a seven digit FIPS code that indicates the state and PUMA numbers. The PUMA polygon features can be used to map the estimates as shaded areas (i.e. choropleth maps) while the PUMA point features can be used to create spatial footnotes that indicate the Margin of Error for the estimate being mapped.

In addition to the PUMA features, additional geographic features such as states, counties, places, and a metropolitan area boundary are included for reference purposes when creating maps.

--------------------------------------------

ERRATA:

(1) Table 24010 is split into three tables due to column limit restrictions in MS Access. The user may join the tables in ArcCatalog if this data is needed.

(2) PUMA5 02300 in Connecticut consists of two separate areas that are split by PUMA5 02500. The PUMA point layer, metro_area_pumas_points, contains no centroid for the western portion of this split PUMA.

(3) PUMA5 00600 in PA, and PUMA5 003101 in NY extend beyond the NY metro statistical area boundary, with part of their area inside the metro and part of their area outside. A decision was made to include rather than exclude them. Since these areas are quite rural, including them has a small impact on the overall population of the metro area.

(4) Table 23001 is split into two tables due to column limit restrictions in MS Access. The user may join the tables in ArcCatalog if this data is needed.

--------------------------------------------

Changes to the 2006-2008 Geodatabase (compared to the 2005-2007 edition):

(1) Greenspace and facilities features for NYC have been added to enhance thematic mapping at the NYC scale.
These features are a selection from the Census TIGER file's landmark feature set and represent the largest or most prominent landmarks in the city. Overlay these features on top of PUMA-based thematic maps to give map readers a better sense of the distribution of the phenomena you are mapping.

(2) Neighborhood names have been added to all NYC PUMA features in order to correlate PUMA numbers with recognizable place names. The neighborhood definitions were developed by the US Census Bureau and the NYC Department of City Planning in the 2005 NYC Housing and Vacancy Survey (in the survey they are referred to as sub-boroughs). The names are also stored in a table called nyc_puma_nbhoods.

(3) NYC county features were edited to remove several blank fields.

(4) The NYC PUMA point feature was edited to include a missing centroid for PUMA 04018 in Brooklyn.

(5) The following tables were added to the database: B11006 (households with presence of persons 60 years and older by type) and B11009 (unmarried partner households).

(6) The following tables were dropped from the database: B10050 (grandparents living with grandchildren), B11007 (households with presence of persons 65 years and older by size and type), B17006 (poverty status of related children by family type and age), and B23008 (age of own children in families and subfamilies by living arrangements by employment status).

(7) The following table was omitted because it was not included in the 2006-2008 ACS: B18002 (sex by age by disability status for civilian population).

(8) Because of column restrictions in MS Access, B23001 has been split into two tables, both of which are included in this database (the second table was accidentally omitted from the 2005-2007 geodatabase).
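
--------------------------------------------

Example: joining tables outside ArcGIS

For users working with exported tables rather than ArcGIS, the two joins described in this document (rejoining a split table on "GEO_ID2", and matching PUMA features to data rows via "PUMA5ID00" = "GEO_ID2") can be sketched in Python as shown below. This is only an illustrative sketch, not the method used to build the geodatabase. The helper join_on_key, the column names EST_A, EST_B and label, and all sample values are hypothetical; only the two key field names come from this document. The sketch assumes the IDs are stored as text, since a seven digit code for a state whose FIPS number begins with zero (such as Connecticut, 09) would lose its leading zero if stored as a number.

```python
def join_on_key(left_rows, right_rows, left_key, right_key):
    """Inner-join two lists of dict rows where left_key equals right_key.

    Keys are compared as text so that leading zeros are significant.
    """
    right_index = {str(row[right_key]): row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_index.get(str(row[left_key]))
        if match is not None:
            merged = dict(row)
            # Copy every field from the matching row except its key field.
            merged.update({k: v for k, v in match.items() if k != right_key})
            joined.append(merged)
    return joined


# Hypothetical rows standing in for two parts of a split data table.
table_part1 = [
    {"GEO_ID2": "3604018", "EST_A": 100},
    {"GEO_ID2": "0902300", "EST_A": 200},
]
table_part2 = [
    {"GEO_ID2": "3604018", "EST_B": 10},
    {"GEO_ID2": "0902300", "EST_B": 20},
]

# Hypothetical rows standing in for a PUMA feature attribute table.
puma_features = [
    {"PUMA5ID00": "3604018", "label": "example PUMA in NY"},
    {"PUMA5ID00": "0902300", "label": "example PUMA in CT"},
]

# Step 1: rebuild the full table from its parts.
full_table = join_on_key(table_part1, table_part2, "GEO_ID2", "GEO_ID2")

# Step 2: attach the data to the PUMA features for mapping.
mapped = join_on_key(puma_features, full_table, "PUMA5ID00", "GEO_ID2")

for row in mapped:
    print(row["PUMA5ID00"], row["label"], row["EST_A"], row["EST_B"])
```

The same pattern applies to a table split into three parts: join the first two parts, then join the result to the third.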