The goal is to view large geometric data over a thinwire. Specifically, we want to (I) process, (II) simplify and (III) display the TIGER Map data for 16 counties in the Metropolitan NYC area. Call these Tasks I, II and III, respectively. It is important to understand that we view the abstract Map as an overlay of three layers: Polygonal Subdivision (Sigma), Transportation Network (Gamma) and Point Landmarks (Lambda).
OVERVIEW OF TASKS
Here is a brief description of the main challenges in each task:
-- Task I requires PostgreSQL and JDBC interfaces to read and join TIGER data files. Some issues here are (a) converting the fixed-field format of TIGER files into the more flexible (array type) fields of PostgreSQL, and (b) merging data from more than one county.
-- Task II requires understanding the data to create effective simplifications. You will need to figure out the nesting structure of the polygons, and determine effective rules for simplification.
-- Task III calls for a Client program that uses JDBC to access the database. It requires an effective use of (a) image buffering, (b) hashing to avoid redundant bandwidth usage, and (c) design for responsiveness. It is important to note that we put all the "smartness" into your client program, as there is no Server Program except for the PostgreSQL server.
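For issue (a) of Task I, the following is a minimal Java sketch of slicing one fixed-width record into typed fields. The column positions below are HYPOTHETICAL and must be replaced with the real ones from the TIGER record layout; the class and method names are illustrative only. It also assumes that coordinates are stored as signed integers with six implied decimal places, and that a blank field should become NULL.

```java
/** Sketch: slice one fixed-width record into typed fields. */
final class FixedFieldRecord {
    // HYPOTHETICAL layout: {first column, last column}, 1-based and inclusive.
    // Replace these with the positions given in the TIGER documentation.
    static final int[] TLID = {1, 10};
    static final int[] CFCC = {11, 13};
    static final int[] LON  = {14, 23};
    static final int[] LAT  = {24, 32};

    /** Returns the trimmed text stored in the given column range. */
    static String text(String record, int[] cols) {
        return record.substring(cols[0] - 1, cols[1]).trim();
    }

    /** Returns the field as an Integer, or null when the field is blank. */
    static Integer integer(String record, int[] cols) {
        String s = text(record, cols);
        return s.isEmpty() ? null : Integer.valueOf(s);
    }

    /** Converts a coordinate with six implied decimals (assumed) to degrees. */
    static Double degrees(String record, int[] cols) {
        Integer v = integer(record, cols);
        return v == null ? null : v / 1_000_000.0;
    }
}
```

A loader would call these helpers on each line read from a record file and pass the results to a JDBC PreparedStatement; mapping blank fields to null keeps NULL handling in one place.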
PLEASE NOTE THAT THE INTERFACE SPECS ARE SUBJECT TO CHANGE. IF YOU WANT SOMETHING CHANGED HERE, PLEASE LET ME KNOW.
Tlines (Tlid     INT,
        StartPt  POINT,
        EndPt    POINT,
        Detail   PATH,      -- type "INT[]" is also OK
        BBox     BOX,       -- bounding box
        CFCC     CHAR(3),
        Name     CHAR(30),
        Side     CHAR(1),
        LPolyId  CHAR(15),
        RPolyId  CHAR(15))

The CFCC, Name and Side information are all found in RT1. Side is either 1 or 2, identifying boundary lines. [NOTE: In RT1, the 2-sided Tlines are given a NULL value, and 1-sided Tlines are given the value 1.] Basically, CFCC and Name provide the "merging information" for Tiger lines. The Detail field includes (redundantly) the StartPt and EndPt, and represents a PATH. However, you can also represent it by an array of INTs. This may be more efficient: we do not need the PATH datatype to support fast range queries, since that function is already provided by the BBox field.
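To illustrate why the BBox field is enough for range queries even when Detail is a plain INT array, here is a small Java sketch. It assumes (this is not specified above) that Detail is stored flat as x0, y0, x1, y1, ... in integer coordinates; the class name and its methods are illustrative only.

```java
/** Sketch: axis-aligned bounding box computed from a flat coordinate array. */
final class BBox {
    final int minX, minY, maxX, maxY;

    BBox(int minX, int minY, int maxX, int maxY) {
        this.minX = minX; this.minY = minY;
        this.maxX = maxX; this.maxY = maxY;
    }

    /** Builds the box of a Detail array laid out as x0, y0, x1, y1, ... (assumed layout). */
    static BBox of(int[] detail) {
        if (detail.length < 2 || detail.length % 2 != 0) {
            throw new IllegalArgumentException("expected a non-empty list of x,y pairs");
        }
        int minX = detail[0], maxX = detail[0];
        int minY = detail[1], maxY = detail[1];
        for (int i = 2; i < detail.length; i += 2) {
            minX = Math.min(minX, detail[i]);
            maxX = Math.max(maxX, detail[i]);
            minY = Math.min(minY, detail[i + 1]);
            maxY = Math.max(maxY, detail[i + 1]);
        }
        return new BBox(minX, minY, maxX, maxY);
    }

    /** True when the two boxes share at least one point (touching counts). */
    boolean overlaps(BBox o) {
        return minX <= o.maxX && o.minX <= maxX
            && minY <= o.maxY && o.minY <= maxY;
    }
}
```

A range query for the current viewing window then only needs the overlaps test against each stored BBox; the Detail points are fetched and decoded only for lines that pass it.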
The LPolyId and RPolyId are the polygons to the left and right of the Tline. Note that these are given by a character string of length 15. The original polygon ids are not unique across counties: we make them unique by concatenating the Census File ID (CenId) information. The original PolyId (before concatenation) and the CenId information are found in Record Type A files. These two fields are 10 and 5 characters, respectively. They are essentially integers (except that CenId is always the letter "C" followed by 4 digits), so we could combine them and view the result as a 14-digit integer. However, for the sake of convenience for Group III, we will simply view it as a 15-character string. [Compared to integers, this is less efficient in comparisons, and wastes space. Perhaps storing it as a 14-digit DOUBLE is best.]
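The key construction described above can be sketched in Java as follows. The order of concatenation (CenId first) and the zero padding of the 10-character PolyId are assumptions made for illustration, as are the class and method names; the text above fixes only the field widths.

```java
/** Sketch: build the 15-character polygon key from CenId and PolyId. */
final class PolyKey {
    /** Concatenates CenId ("C" plus 4 digits) with PolyId zero-padded to 10 digits. */
    static String combine(String cenId, long polyId) {
        if (!cenId.matches("C\\d{4}")) {
            throw new IllegalArgumentException("CenId must be 'C' followed by 4 digits");
        }
        if (polyId < 0 || polyId > 9_999_999_999L) {
            throw new IllegalArgumentException("PolyId must fit in 10 digits");
        }
        return cenId + String.format("%010d", polyId);
    }

    /** Drops the leading 'C' and reads the remaining 14 digits as one integer. */
    static long asLong(String key) {
        return Long.parseLong(key.substring(1));
    }
}
```

A 14-digit value fits in a Java long, and it is also below 2^53, so it can be held exactly in a double if that representation is chosen instead.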
Here is the second table:
TPolys (PolyId        CHAR(15),    -- combined with Census ID
        BoundaryList  ARRAY[int],  -- unordered list of Tlines
        BBox          BOX,         -- bounding box for Polygon
        Place90       INT,         -- FIPS 55 Code.
        Watermark     CHAR(1),
        Name          CHAR(60))    -- using "TEXT" type is also OK

The TPolys table has an entry for each original polygon in the Tiger data. BoundaryList is a list of all the Tlids that bound the current polygon. This list is in arbitrary order -- Task II will sort this out later. Note that we have a BBox to facilitate range queries on TPolys. Place90 is the information from RTA and helps identify the type of this polygon -- it is the same as the Fips field in RTC. Name is a field from RTC, and again helps in merging and coloring area landmarks (color pink). To get this information, you can join RTA and RTC:
=> SELECT Place90, Name FROM RTA, RTC
-> WHERE RTA.Place90 = RTC.Fips and RTC.Fips>0;

Note that the "Fips>0" clause assumes that when the Fips field is NULL in RTC, we replace it by -1. Watermark identifies whether this polygon is water or not: this information can be found in RTS.
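Since BoundaryList is stored in arbitrary order, Task II eventually has to chain the Tlines into a loop. The following Java sketch shows one simple way to do this for a polygon whose boundary is a single closed loop. It is not a prescribed method: polygons with holes or several loops would need more work, and the Seg type with string endpoint keys is a hypothetical stand-in for rows of the Tlines table.

```java
import java.util.*;

/** Sketch: order an unordered set of boundary lines into one closed loop. */
final class LoopOrder {
    /** One boundary line: its Tlid plus endpoint keys (e.g. "lon,lat"). Hypothetical type. */
    static final class Seg {
        final int tlid; final String start, end;
        Seg(int tlid, String start, String end) {
            this.tlid = tlid; this.start = start; this.end = end;
        }
    }

    /** Returns the Tlids in traversal order, starting from the first segment given. */
    static List<Integer> order(List<Seg> segs) {
        if (segs.isEmpty()) return new ArrayList<>();
        // Index every segment under both of its endpoints.
        Map<String, List<Seg>> byPoint = new HashMap<>();
        for (Seg s : segs) {
            byPoint.computeIfAbsent(s.start, k -> new ArrayList<>()).add(s);
            byPoint.computeIfAbsent(s.end, k -> new ArrayList<>()).add(s);
        }
        List<Integer> out = new ArrayList<>();
        Set<Integer> used = new HashSet<>();
        Seg first = segs.get(0);
        String at = first.end;              // point reached so far
        out.add(first.tlid);
        used.add(first.tlid);
        while (out.size() < segs.size()) {
            Seg next = null;
            for (Seg c : byPoint.get(at)) {
                if (!used.contains(c.tlid)) { next = c; break; }
            }
            if (next == null) {
                throw new IllegalStateException("boundary does not chain at " + at);
            }
            // A segment may be traversed against its stored direction.
            at = next.start.equals(at) ? next.end : next.start;
            out.add(next.tlid);
            used.add(next.tlid);
        }
        if (!at.equals(first.start)) {
            throw new IllegalStateException("loop is not closed");
        }
        return out;
    }
}
```

The walk accepts segments stored in either direction, which matters because a Tline is oriented independently of the polygons on its left and right.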
Here is the third table:
Landmarks (LandId    INT,       -- landmark identifier number
           CFCC      CHAR(3),   -- CFCC code
           Name      CHAR(30),  -- name
           Position  POINT      -- lon/lat
          )

This Landmarks table is derived from Record Type 7. However, we should make sure that only point landmarks are kept in this list: entries that are not point landmarks can be recognized by the fact that their Lon/Lat values in the RT7 file are NULL.
TPolyX (PolyId     INT,
        OuterLoop  ARRAY[INT],
        BBox       BOX,        -- bounding box
        Parent     INT,
        Color      INT,        -- white, blue, green, yellow, pink
        Name       TEXT
       )

TLineX (Tlid     INT,
        StartPt  POINT,
        EndPt    POINT,
        Detail   PATH,         -- could be ARRAY[INT] if you like
        BBox     BOX,          -- bounding box
        CFCC     CHAR(3),
        Border   CHAR(1),
        Name     TEXT,
        Usage    CHAR(1),
        LPolyId  INT,
        RPolyId  INT
       )

NOTES:
NY:
  36005 Bronx
  36047 Kings (Brooklyn)
  36061 New York (Manhattan)
  36081 Queens
  36085 Richmond (Staten Island)
NJ:
  34003 Bergen (Northern NJ)
  34017 Hudson (main connection from Manhattan to NJ)
  34039 Union (connected to Staten Island)

The remaining 8 counties are
  36059 Nassau (in Long Island)
  36103 Suffolk (in Long Island)
  36087 Rockland (Upstate NY)
  36071 Orange (Upstate NY)
  36119 Westchester (Upstate NY)
  34013 Essex (NJ, west of Hudson)
  34023 Middlesex (Southern NJ)
  34025 Monmouth (Southern NJ)
  Prince William 51153
  Manassas 51683
  Manassas Park 51685
1) Use indexes. E.g., for Tlines, you probably should create an index on Tlid.
2) To process the boundary lines for each group of counties, you probably want to keep them in a separate temporary table at first -- the number of these boundary lines is MUCH smaller than the total number of lines.
GROUP I Targets:
  > make            -- compile all the programs
  > make tlines     -- create the Tlines table from scratch
  > make tpolys     -- create the Tpolys table from scratch
  > make landmarks  -- create the Landmarks table from scratch
GROUP II Targets:
  > make            -- compile all the programs
  > make N          -- where N = 0, 1 or 2. It should create the Level N tables from scratch
GROUP III Targets:
  > make            -- compile all the programs
  > make md         -- start the MapDisplay program

You can insert other targets as you find useful. Put verbal instructions as comments in the Makefile. Remember that each target can have several equivalent names (e.g., "make one" could be equivalent to "make TigerLoader").