Getis-Ord Gi is a local statistic used to identify spatial clusters of high or low values in a spatial dataset. It should not be confused with the Getis-Ord General G, a global statistic that summarizes clustering over the whole study area in a single value. The method was developed by Arthur Getis and J. K. Ord in 1992.
The Gi statistic measures local spatial association: how strongly high or low values of a variable concentrate in the neighborhood of each location. This is closely related to spatial autocorrelation, the extent to which similar values tend to cluster together in space. The Gi statistic is calculated for each location in the dataset and can be used to identify clusters of high or low values. It is not designed to flag spatial outliers, such as a high value surrounded by low values; statistics such as local Moran's I are used for that purpose.
The calculation of the Gi statistic involves three steps:
Calculate the local sum for each location. This involves adding up the weighted variable values of the neighboring locations. For Gi, the value at the location itself is excluded from the sum.
Calculate the sum and mean of the variable. This involves adding up the variable values and dividing by the number of locations included. For Gi, the value at the location itself is left out of this calculation as well.
Calculate the standard deviation over the same set of locations.
The Gi statistic for each location is then calculated as follows:
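Gi = (Σj≠i(wij * Xj) - Wi * Xbar(i)) / (S(i) * √(((N - 1) * S1i - Wi^2) / (N - 2)))
where Xj is the value of the variable at location j, wij is a spatial weight that encodes the spatial relationship between locations i and j (for example, 1 if j lies within a chosen distance of i and 0 otherwise), Wi = Σj≠i(wij) is the sum of the weights for location i, S1i = Σj≠i(wij^2) is the sum of the squared weights, N is the total number of locations, and Xbar(i) and S(i) are the mean and standard deviation of the variable computed over all locations except i. This is the standardized form given by Ord and Getis in 1995; the original 1992 statistic is the ratio Σj≠i(wij * Xj) / Σj≠i(Xj).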
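The calculation can be sketched in NumPy as follows. This is a minimal illustration of the standardized Gi statistic in the form published by Ord and Getis (1995), not a reference implementation. The function names distance_band_weights and local_gi, the binary distance-band weighting scheme, and the sample data are assumptions made for the example.

```python
import numpy as np

def distance_band_weights(coords, d):
    """Binary weights: w[i, j] = 1 if location j lies within distance d of i, else 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = (dist <= d).astype(float)
    np.fill_diagonal(w, 0.0)  # Gi excludes location i from its own neighborhood
    return w

def local_gi(x, w):
    """Standardized Gi for every location, leaving location i out of its own sums."""
    x = np.asarray(x, dtype=float)
    w = np.array(w, dtype=float)      # copy so the caller's matrix is not modified
    np.fill_diagonal(w, 0.0)
    n = x.size
    local_sum = w @ x                 # sum over j != i of w_ij * x_j
    w_sum = w.sum(axis=1)             # W_i
    w_sq_sum = (w ** 2).sum(axis=1)   # S_1i
    # Mean and standard deviation of all values except x_i
    mean_excl = (x.sum() - x) / (n - 1)
    var_excl = ((x ** 2).sum() - x ** 2) / (n - 1) - mean_excl ** 2
    s_excl = np.sqrt(var_excl)
    denom = s_excl * np.sqrt(((n - 1) * w_sq_sum - w_sum ** 2) / (n - 2))
    return (local_sum - w_sum * mean_excl) / denom

# Six locations on a line, one unit apart; high values on the left, low on the right
coords = [[float(i), 0.0] for i in range(6)]
values = [10, 9, 8, 1, 2, 1]
weights = distance_band_weights(coords, d=1.0)
gi = local_gi(values, weights)
```

The sketch does not guard against a zero denominator, which occurs when a location has no neighbors or when every other location is its neighbor with equal weight.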
A positive Gi value indicates that the neighbors of the location tend to have high values, while a negative Gi value indicates that they tend to have low values. The larger the magnitude of the Gi value, the stronger the spatial clustering.
The Gi statistic can be visualized using a map, with locations colored based on their Gi values. This helps reveal where high or low values cluster in the dataset. The Gi statistic is commonly used in geography, epidemiology, and environmental science to analyze spatial patterns in data.
Getis-Ord Gi* (pronounced "Getis-Ord G-i-star") is a variant of the Getis-Ord Gi statistic that includes the value at location i itself in the local sum. It is used to identify statistically significant hotspots and coldspots in a spatial dataset. Gi* was introduced by Arthur Getis and J. K. Ord alongside Gi in 1992, and Ord and Getis published the standardized z-score form of both statistics in 1995.
The Gi* statistic is calculated using a similar formula to the Gi statistic, but the sums run over all locations j, including j = i, and the mean and standard deviation are computed over the entire dataset. The formula for the Gi* statistic is:
Gi* = (Σj(wij * Xj) - Xbar * Σj(wij)) / (S * √((N * Σj(wij^2) - (Σj(wij))^2) / (N - 1)))
where N is the total number of locations in the dataset, Xbar = Σj(Xj) / N is the mean of the variable, and S = √(Σj(Xj^2) / N - Xbar^2) is its standard deviation.
The numerator of the Gi* formula is the difference between the observed local sum and its expected value under the null hypothesis of no spatial clustering, which is the mean multiplied by the sum of the weights for the location. The denominator is the standard deviation of the local sum under the same null hypothesis and is used to standardize the numerator.
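A minimal NumPy sketch of the standardized Gi* calculation follows, assuming the commonly published form in which each location counts as its own neighbor and the mean and standard deviation are taken over the whole dataset. The function name local_gi_star, the choice of a self-weight of 1, and the sample data are assumptions made for illustration.

```python
import numpy as np

def local_gi_star(x, w):
    """Standardized Gi* z-score for every location (location i is included)."""
    x = np.asarray(x, dtype=float)
    w = np.array(w, dtype=float)      # copy so the caller's matrix is not modified
    np.fill_diagonal(w, 1.0)          # assumption: each location has self-weight 1
    n = x.size
    x_bar = x.mean()                  # mean over the entire dataset
    s = x.std()                       # population std: sqrt(sum(x^2)/n - x_bar^2)
    w_sum = w.sum(axis=1)             # sum over j of w_ij
    w_sq_sum = (w ** 2).sum(axis=1)   # sum over j of w_ij^2
    numerator = w @ x - x_bar * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Six locations on a line; each location neighbors the ones directly beside it
idx = np.arange(6)
weights = (np.abs(idx[:, None] - idx[None, :]) == 1).astype(float)
values = [10, 9, 8, 1, 2, 1]
z_scores = local_gi_star(values, weights)
```

In this example the z-scores are positive at the high-valued end of the line and negative at the low-valued end. The denominator becomes zero if a location is connected to every location with equal weight, so a production implementation would need to handle that case.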
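The Gi* statistic produces a z-score, which can be used to determine the statistical significance of a hotspot or coldspot. The sign of the z-score gives the direction of the clustering, and its magnitude determines whether the clustering is significant. A large positive z-score indicates a statistically significant hotspot (i.e., a location with a high value surrounded by locations with high values), while a large negative z-score indicates a statistically significant coldspot (i.e., a location with a low value surrounded by locations with low values). A z-score near zero indicates no apparent spatial clustering around the location.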
The significance of the z-score can be determined using a p-value or a critical value. A p-value represents the probability of obtaining a z-score at least as extreme as the observed value, assuming that the null hypothesis (i.e., no spatial clustering) is true. A critical value represents the threshold that the absolute value of the z-score must exceed to be considered statistically significant. For a two-sided test at the 5% significance level, for example, the critical value is 1.96.
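Both approaches can be sketched with the Python standard library alone. The example assumes that the z-score follows a standard normal distribution under the null hypothesis and tests each location on its own, with no correction for multiple comparisons. The function names two_sided_p_value and classify, and the category labels, are illustrative.

```python
import math

def two_sided_p_value(z):
    """Probability that a standard normal variable is at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def classify(z, critical=1.96):
    """Label a z-score using a two-sided critical value (1.96 for the 5% level)."""
    if z >= critical:
        return "hotspot"
    if z <= -critical:
        return "coldspot"
    return "not significant"

# Example: label a handful of hypothetical z-scores
for z in (2.8, 0.4, -2.1):
    print(z, round(two_sided_p_value(z), 4), classify(z))
```

The two approaches agree: a z-score whose absolute value exceeds 1.96 has a two-sided p-value below 0.05.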
The Gi* statistic can be used to identify hotspots and coldspots in various spatial datasets, such as crime data, disease incidence data, and environmental data. The method is particularly useful for identifying local patterns that global statistics can miss, since a global statistic summarizes the whole study area in a single value, and for generating hypotheses about the underlying causes of spatial clustering.