Datasets:There are three datasets to be used in this homework:1. New York State Health Facility General Information (Download the Health_Facility_General_Information.csv for thishomework). This includes the address and geolocation of each healthcare facility:Facility ID, Facility Name, Description, Facility Address 1, Facility Address 2,Facility City, Facility State, Facility ZipCode, Facility County Code, Facility County, Facility Latitude, Facility Longitude2. Health Facility Certification Information (download the Health_Facility_Certification_Information.csv for this homework).This includes information on certifications for services and beds for each facility.Facility ID, Facility Name, Description, Attribute Type, Attribute Value, Measure Value, County3. The US zip code tabulation areas from US Census Bureau, which contains the boundary of each zip code(download tl_2019_us_zcta510.zip). Once you unzip the file, you can run the following command under db2 command lineto find metadata for the format. (Documentation of TIGER datasets can be found here.)db2se shape_info -fileName tl_2019_us_zcta510.shp
The shapefile has a multipolygon object to represent the boundary of each zip code.Tasks (10 + 3 extra credit points)1. (3 points) Setup the database.a. Enable the sample database (or your own database) for spatial support:db2se enable_db sampleb. Load the zip code area dataset using the import SQL file:db2 -tf import_zip.sqlc. Create two tables for facilities using the createfacilititytable.sql (we create two tables, cse532.facilityoriginal for originaldata, and cse532.facility with a spatial column).db2 -tf createfacilititytable.sqld. Load Health_Facility_General_Information.csv into cse532.facilityoriginal using script:db2 load from C:myfolderHealth_Facility_General_Information.csv of del MESSAGES load.msg INSERT INTOcse532.facilityoriginale. Write a SQL script facilityinsert.sql to insert data into cse532.facility by selecting data from cse532.facilityoriginaltable and converting (Latitude, Longitude) attributes into DB2GSE.ST_POINT type with srs_id 1 for geolocation attributein cse532.facility.f. Create a SQL script createfacilititycertificationtable.sql to create a table: [email protected]5/29/2020 Homework3: Spatial Queries CSE532-S20https://sites.google.com/a/stonybrook.edu/cse532-s20/homework/homework3-2 2/2cse532.facilitycertification (FacilityID, FacilityName, Description, AttributeType, AttributeValue, MeasureValue, County)and load the csv file into the table:db2 load from C:yourpathHealth_Facility_Certification_Information.csv of del MESSAGES load.msg INSERT INTOcse532.facilitycertificationg. Update the createindexes.sql to add additional indexes besides spatial indexes for the queries below.db2 -tf createindexes.sql2. (3 points) Write a query nearester.sql to find closest healthcare facility with an ER room (AttributeValue =Emergency Department) from 2799 Horseblock Road Medford, NY 11763(40.824369, -72.993983)(latitude, longitude). Please return location and distance in your result. You can use unit KILOMETER, METER, orSTATUTE MILE for distance measurement.Nearest neighbor search is not directed supported by DB2. You can use ST_BUFFER to create a buffered area (polygon/circle)from a point within a certain distance and search only stores within the buffer. Note that 0.25 degree is roughly 10 miles. Forall the datasets, we use spatial reference nad83_srs_1 with srs ID as 1.You can find information here on functions such as ST_POINT, ST_BUFFER, ST_WITHIN or ST_CONTAINS, and ST_DISTANCE.3. (3 points) Write a query noerzips.sql to find zip codes without any Emergency Department, neither in theirneighboring zip codes.4. (1 point) Drop all indexes and perform the two queries again, and compare the query performance in terms ofexecution time for above two queries.Show your time difference with and without indexes in your README file.You can refer to how to estimate query time.5. (Extra credit: 3 points). Write SQL queries or stored procedure mergezip.sql to merge zip code areas into largeones with neighboring zip code areas, so that the new population in each merged region (combined zipcodes) is largethan the current average population, using the zip code population table in Homework1. For simplicity, you can remove theduplicates from the population table.
Reviews
There are no reviews yet.