shaunaewald.files.wordpress.com€¦ · Web view2018. 2. 2. · So both the z-score and the p-score indicate that the data is significantly clustered. Part 8: Clustering of Values

Shauna Ewald

September 27, 2017

GIS4850 Lab 4

Spatial Statistics and Point Pattern Analysis

Part 1: Displaying Crime Locations (1.0 points)

Lab4.mxd with 4 feature classes: crime, police_stations, police_districts, police_precincts

1. How many calls did District 5 answer? (.5 points)

Crime attribute table, Select by Attribute where DISTRICT_ID = '5'

778 out of 7506

2. How many of the calls fall within the region of District 5? (.5 points)

Zooming in shows that some crime points fall within District 5 but are not included in the DISTRICT_ID = '5' selection.

Police_districts attribute table, select DIST_NUM = 5

Select Layer By Location: crime WITHIN police_districts

Crime attribute table: 781 out of 7506
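The two counts differ because one selection uses the recorded attribute and the other uses geometry. A minimal sketch in plain Python, assuming a simple ray-casting point-in-polygon test; the square district and the three calls are hypothetical, and this is not how ArcGIS implements Select Layer By Location.

```python
def point_in_polygon(pt, polygon):
    """Ray casting: count how many polygon edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical district boundary and calls: (DISTRICT_ID attribute, location)
district_5 = [(0, 0), (4, 0), (4, 4), (0, 4)]
calls = [("5", (2, 2)), ("6", (1, 1)), ("5", (5, 5))]

by_attribute = [c for c in calls if c[0] == "5"]
by_location = [c for c in calls if point_in_polygon(c[1], district_5)]
```

Here both selections return two calls, but not the same two: one call answered by District 5 lies outside the boundary, and one call inside the boundary was answered by another district.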

Part 2: Identifying Mean Centers (1.0 points)

Create a new geodatabase in ArcCatalog: Lab4Results

In ArcMap (File > Map Document Properties), point the output to the new location.

(The simplest measurement of a geographic distribution is the mean center. This is calculated by finding the average of the x-coordinate values of all the features, then finding the average of all the y-coordinate values. The resulting x,y pair is the mean center.)

Mean Center tool Input: crime Output: All_Mean_Center

Mean Center tool Input: crime Output: Casewise_Mean_Center Case Field: DISTRICT_ID
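The two Mean Center runs can be sketched in plain Python as an overall average and a per-case average. The sample records are hypothetical and the function names are illustrative, not part of ArcGIS.

```python
from collections import defaultdict

def mean_center(points):
    """Average the x values and the y values of a list of (x, y) tuples."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def casewise_mean_centers(records):
    """Group (case, x, y) records by case and return one mean center per case."""
    groups = defaultdict(list)
    for case, x, y in records:
        groups[case].append((x, y))
    return {case: mean_center(pts) for case, pts in groups.items()}

# Hypothetical calls: (DISTRICT_ID, x, y)
sample = [("5", 0.0, 0.0), ("5", 2.0, 4.0), ("6", 10.0, 10.0)]
all_center = mean_center([(x, y) for _, x, y in sample])
by_district = casewise_mean_centers(sample)
```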

3. Make a map. (1.0 points)

Part 3: Central Feature (1.0 points)

(To find the central feature of a geographic distribution, we find which feature has the lowest total distance to all of the other features) Let’s find the central police station.

Central Feature tool Input: police_stations Output: Cental_Station Distance Method: Manhattan
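A minimal sketch of the central feature idea with Manhattan distance: sum each feature's distance to every other feature and keep the smallest total. The station coordinates are hypothetical.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def central_feature(points):
    """Return the index of the point with the lowest total distance to all others."""
    totals = [sum(manhattan(p, q) for q in points) for p in points]
    return min(range(len(points)), key=totals.__getitem__)

# Hypothetical station locations
stations = [(0, 0), (2, 2), (6, 6)]
central_index = central_feature(stations)
```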

4. Which feature (station number) is the central station for Denver police stations? Make a map of the Central Feature. (1.0 points)

Station 8 at 1566 N Washington St

Part 4: Weighted Mean Centers. (2.0 points)

The crime layer has a field called Is_Crime. This field is 1 for a crime and 0 for a traffic incident. Since crime calls are more important, we can assign them a higher weight. Find the weighted mean center of the crime layer.

Mean Center tool Input: crime Output: Weighted_Mean_Centers Weight Field: Is_Crime
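A minimal sketch of a weighted mean center, where each coordinate is multiplied by its weight before averaging. With weights of only 0 and 1, as in Is_Crime, the result equals the mean center of the crime-only points. The sample records are hypothetical.

```python
def weighted_mean_center(records):
    """records holds (x, y, weight); returns sum(w * x) / sum(w) and sum(w * y) / sum(w)."""
    total = sum(w for _, _, w in records)
    cx = sum(x * w for x, _, w in records) / total
    cy = sum(y * w for _, y, w in records) / total
    return (cx, cy)

# Hypothetical calls: (x, y, Is_Crime)
calls = [(0.0, 0.0, 1), (4.0, 0.0, 0), (2.0, 2.0, 1)]
weighted = weighted_mean_center(calls)

# Unweighted mean center of all calls, for comparison
unweighted = weighted_mean_center([(x, y, 1) for x, y, _ in calls])
```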

5. Make a map showing the Weighted Mean Centers and the original Mean Centers for crimes. (1.0 pt)

6. Based on the distance between the Mean Center for all crime and the Weighted Mean Center, evaluate the spatial distributions of crime-only locations vs. crime-and-traffic locations. Do you think the distributions are similar, or different? How do you know? (1.0 Points)

Selecting Is_Crime = 0 returns 1840 out of 7506 records, meaning that there were 1840 traffic incidents and 5666 crime incidents, so crimes far outnumber traffic incidents. However, the proximity of the two mean centers implies that the distributions are similar. Crime may trend slightly west, since the weighted mean center lies in that direction from the overall mean center.

Part 5: Standard Distance (1.0 points)

(Measuring standard distance lets you quantify the amount of dispersion in a set of features. It is calculated by determining the average distance each feature is from the mean center, then determining a value for how much the distances deviate from the average distance. The results are displayed as a circular ring indicating one, two, or three standard deviations.)

Standard Distance tool Input: crime Output: Crime_Std_Dist Circle Size: 1_standard_deviation
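A minimal sketch of standard distance, assuming the usual definition as the square root of the mean squared distance from the mean center. The sample points are hypothetical, and the second function simply counts the share of points that fall inside the one-standard-distance circle.

```python
import math

def standard_distance(points):
    """Return (mean center, standard distance) for a list of (x, y) points."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return (cx, cy), math.sqrt(mean_sq)

def fraction_within(points, center, radius):
    """Share of points no farther than radius from center."""
    inside = sum(1 for x, y in points
                 if math.hypot(x - center[0], y - center[1]) <= radius)
    return inside / len(points)

# Hypothetical sample: four corners of a square plus its middle
sample = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]
center, sd = standard_distance(sample)
share = fraction_within(sample, center, sd)
```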

7. Make a map showing the Standard Distance and Mean Center layers. (0.8 points)

8. About what percentage of crimes occur within the Standard Distance circle? HINT: 1 standard distance represents 1 standard deviation of the distribution. (0.2 points)

For two-dimensional data, one standard distance covers about 63% of the features, so about 63% of crimes occur within the standard distance circle.
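The 63% figure can be checked numerically. This sketch assumes the points are spread like a circular bivariate normal distribution, which real crime data need not follow; under that assumption the share inside one standard distance is 1 - e^-1. The simulation size and seed are arbitrary choices.

```python
import math
import random

def share_within_one_standard_distance(n=20000, seed=0):
    """Simulate n circular-normal points and measure the share inside one standard distance."""
    rng = random.Random(seed)
    pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    sd = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts) / n)
    inside = sum(1 for x, y in pts if math.hypot(x - cx, y - cy) <= sd)
    return inside / n

theoretical = 1 - math.exp(-1)   # about 0.632
simulated = share_within_one_standard_distance()
```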

Part 6: Standard Deviational Ellipse (1.0 points)

The distribution of features may have a directional trend. By calculating the variances for both the x and y directions independently, the directional trend can be shown using an ellipse instead of a circle. For example, the freeways in Denver may cause a directional trend in calls. By calculating the standard deviational ellipse, you can visualize any effects that the roads may be having.

Directional Distribution tool Input: crime Output: Crime_Ellipse Ellipse Size: 1_standard_deviation
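A rough sketch of how a directional trend can be extracted from point coordinates, using the variances and covariance of x and y. This is a generic covariance-based ellipse, not the Directional Distribution tool itself; the tool's rotation convention and axis scaling may differ. The sample points are hypothetical.

```python
import math

def deviational_ellipse(points):
    """Return (angle in degrees counterclockwise from the x-axis, major sd, minor sd)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Orientation of the major axis of the 2x2 covariance matrix
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # Eigenvalues of the covariance matrix are the variances along the two axes
    mid = (sxx + syy) / 2
    half = math.hypot((sxx - syy) / 2, sxy)
    major = math.sqrt(mid + half)
    minor = math.sqrt(max(mid - half, 0.0))
    return math.degrees(theta), major, minor

# Hypothetical points lying along a diagonal road
angle, major, minor = deviational_ellipse([(0, 0), (1, 1), (2, 2), (3, 3)])
```

For points on a perfect diagonal, the major axis is at 45 degrees and the minor axis collapses to zero.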

9. Make a map showing the Mean Center and the Crime Ellipse layers. (0.5 points)

Directional Distribution tool Input: crime Output: Crime_Ellipse Case Field: DISTRICT_ID Ellipse Size: 1_standard_deviation

10. Make a map showing the Mean Center, Crime Ellipse and the Case Crime Ellipse layers. (0.5 points)

Part 7: Nearest Neighbor Analysis (.5 points)

The Average Nearest Neighbor tool can be used to determine if a set of features shows a statistically significant clustering or dispersion. It does this by measuring the distance from each feature to its nearest neighbor feature. An index value is returned. If the value is less than one, then the data is considered clustered. If the value is greater than one, then the data is considered dispersed.

For Nearest Neighbor to work, we must have the size of the study area. Open Attribute Table for police_districts. Right-click on the Shape_Area field and run Statistics. Highlight and Copy the Sum. We’ll use that in our analysis. Sum: 4304588145.63032

Average Nearest Neighbor tool Input: crime Distance Method: Euclidean Area: 4304588145.63032
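A minimal sketch of the nearest neighbor ratio and its z-score, assuming the standard expected-distance formula for a random pattern (0.5 / sqrt(n / A)) and its standard error (0.26136 / sqrt(n^2 / A)). The points and area are hypothetical, and the brute-force search is only practical for small samples.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (nearest neighbor ratio, z-score) for (x, y) points in a study area."""
    n = len(points)
    # Observed mean distance from each point to its nearest other point
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)        # mean distance under a random pattern
    std_err = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_err

# Hypothetical: four tightly grouped points in a large study area
ratio, z = average_nearest_neighbor([(0, 0), (1, 0), (0, 1), (1, 1)], area=10000.0)
```

A ratio below 1 with a strongly negative z-score points to clustering, which is the same reading applied to the 0.46 result in this lab.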

11. What is the Nearest Neighbor Ratio value? Is it clustered or dispersed? Is the result significant or not? (0.5 points)

The Nearest Neighbor Ratio is 0.46. Because the index is less than 1, the pattern exhibits clustering. The z-score is the number of standard deviations from random, and the z-score here is very far from random. According to desktop.arcgis.com, “the p-value is the probability that the observed spatial pattern was created by some random process. When the p-value is very small, it means it is very unlikely (small probability) that the observed spatial pattern is the result of random processes.” So both the z-score and the p-value indicate that the data is significantly clustered.

Part 8: Clustering of Values (.5 points)

The Average Nearest Neighbor tool determines whether features are clustered or dispersed, but it doesn’t consider whether similar values are clustered or dispersed. The High/Low Clustering tool reports whether low values are clustered together, high values are clustered together, or the pattern is random.

Use the OFFENSE_CODE field as a value. In general, lower values are more severe crimes than higher values. So if low values cluster, that is an area with the worst types of crimes.

The type for OFFENSE_CODE is a String in our data. This happens a lot when working with data. How can we turn it into a numeric type we can use?

In the attribute table, Add Field with Name: Offense_Code_1 and Data Type: Double, then Field Calculator: Offense_Code_1 = [OFFENSE_CODE]
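Outside ArcGIS, the same string-to-number conversion can be sketched in plain Python. The helper name and the sample values are hypothetical; values that cannot be parsed become None rather than raising an error.

```python
def to_number(text):
    """Convert a string code to a float, or return None if it is missing or not numeric."""
    try:
        return float(text.strip())
    except (ValueError, AttributeError):
        return None

# Hypothetical OFFENSE_CODE strings
codes = ["2399", " 5441 ", "n/a", None]
numeric_codes = [to_number(c) for c in codes]
```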

High/Low Clustering (Getis-Ord General G) tool Input Feature Class: crime Input Field: Offense_Code_1 Conceptualization of Spatial Relationships: INVERSE_DISTANCE Distance Method: Euclidean
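A minimal sketch of the General G statistic with inverse-distance weights: the weighted sum of value products over all pairs, divided by the unweighted sum. It assumes no two points share a location (which would divide by zero) and omits the z-score that decides significance. The sample data are hypothetical.

```python
import math

def general_g(points, values):
    """General G with inverse-distance weights; assumes no coincident points."""
    num = 0.0
    den = 0.0
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            weight = 1.0 / math.dist(points[i], points[j])
            num += weight * values[i] * values[j]
            den += values[i] * values[j]
    return num / den

# Hypothetical points on a line: two close together, one far away
pts = [(0, 0), (1, 0), (10, 0)]
g_high_together = general_g(pts, [10, 10, 1])   # high values are neighbors
g_high_apart = general_g(pts, [10, 1, 10])      # high values are far apart
```

G is larger when high values sit next to each other, which is why it has to be compared with its expected value rather than read on its own.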

12. What is the Observed General G value? Is the result significant? Is there clustering and if so, what kind? What does this mean for crimes in Denver? (.5 points)

The Observed General G is 0.000392, and the result is significant. If G is high, then high values (less severe crimes) cluster together. Here G is low, so low values (more severe crimes) are clustered.

Part 9: Cluster and Outlier Analysis (1.0 points)

The Cluster and Outlier Analysis tool will take a set of features with a weight or attribute value, perform pattern analysis, and display the results to highlight clustering.

Cluster and Outlier Analysis (Anselin Local Moran’s I) tool Input Feature Class: crime Input Field: Offense_Code_1 Output: Crime_Clusters
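A rough sketch of Local Moran's I with inverse-distance weights, plus a quadrant label (HH, LL, HL, LH) taken from the sign of each value's deviation and of its neighbors' weighted deviation. It assumes no coincident points and skips the significance test that the tool applies before assigning a cluster type. The sample data are hypothetical.

```python
import math

def local_morans_i(points, values):
    """Return a list of (I_i, quadrant label) using inverse-distance weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    results = []
    for i in range(n):
        # Variance of the other features' values around the global mean
        s2 = sum(dev[j] ** 2 for j in range(n) if j != i) / (n - 1)
        # Weighted sum of the neighbors' deviations (the spatial lag)
        lag = sum(dev[j] / math.dist(points[i], points[j])
                  for j in range(n) if j != i)
        label = ("H" if dev[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((dev[i] / s2 * lag, label))
    return results

# Hypothetical: one low value surrounded by high values
pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
res = local_morans_i(pts, [9, 1, 9, 9])
```

The second feature comes out as LH with a negative I, the same low-value-among-high-values situation as the blue Low Outlier features.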

13. Make a map showing the output from Cluster and Outlier Analysis. (0.8 points)

14. What do the blue "Low Outlier" features mean? Are similar values or dissimilar values at these locations? (0.2 points)

The COType field for blue “low outlier” features is classified as LH. That means the feature has a low value and is surrounded by features with high values, so the values at these locations are dissimilar. Low values are features with more severe crime. So there is a severe crime location surrounded by locations of less severe crime.

Part 10: Hot-Spot Analysis (1.0 points)

Hot-Spot Analysis uses the Getis-Ord Gi* statistic to find hot spots of similar values. Local Moran's I only tells you if clustering is occurring, while Hot-Spot Analysis tells you whether high values or low values are near each other.

Hot Spot Analysis (Getis-Ord Gi*) tool Input Feature Class: crime Input Field: Offense_Code_1 Output: Hot_Spots

Conceptualization of Spatial Relationships should be FIXED_DISTANCE_BAND
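A minimal sketch of the Gi* z-score with a fixed distance band: every feature within the band, including the feature itself, gets weight 1 and all others get weight 0. The band width and sample data are hypothetical, and a feature whose band covers the whole dataset is returned as None because the statistic is undefined there.

```python
import math

def gi_star(points, values, band):
    """Return one Gi* z-score per feature using binary fixed-distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = [1.0 if math.dist(points[i], points[j]) <= band else 0.0
             for j in range(n)]                  # includes the feature itself
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator if denominator else None)
    return scores

# Hypothetical: low values grouped on the left, high values grouped on the right
pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
z_scores = gi_star(pts, [1, 1, 9, 9], band=2.0)
```

The low-value group gets negative z-scores (cold spot) and the high-value group gets positive z-scores (hot spot).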

15. Make a map showing both the Hot Spot results. Do you see a pattern? (.5 points)

Remember, in general, lower values are more severe crimes than higher values.

The Gi* statistic returned for each feature in the dataset is a z-score. For statistically significant positive z-scores, the larger the z-score is, the more intense the clustering of high values (hot spot – red – smaller crimes). For statistically significant negative z-scores, the smaller the z-score is, the more intense the clustering of low values (cold spot – purple – more severe crimes).

There is one large area of purple cold spot features that represent severe crimes, while there are three separated areas of red hot spot features that represent less severe crimes. More police presence, if not more police precincts, should be directed to the purple cold spot area.

16. Answer the following questions: (.5 points)

a. How long did the lab assignment take you to finish? 3-4 hours

b. Were there any errors in the assignment that made it difficult for you to finish? No errors. I read and re-read the definitions of z-score, p-value, and the null hypothesis.

c. Where did you get help if you needed it? Class notes, ArcGIS Help
