Geographically weighted regression (GWR) is a spatial analysis technique. It was introduced in the geographical literature in 1996, drawing on statistical approaches used for curve fitting and smoothing. GWR is designed for non-stationary variables (e.g., climate, demographic factors, physical environment characteristics). Stationarity here means that descriptive statistics (mean, variance) and location dependency do not vary through space; GWR is used to analyze cases where they do. It models the spatially varying relationship between a response (the dependent variable, Y) and one or more explanatory variables (the independent variables, X). GWR is increasingly used in geography and other fields because it can investigate non-stationary relationships in regression analysis.
Regression analysis is used to model, examine, and explore spatial relationships. It helps explain spatial patterns and the factors behind them, and the fitted model can be used to predict future patterns, including spatial ones. Ordinary least squares (OLS) is the best-known technique for this.
However, GWR is another spatial regression technique that is increasingly being used. GWR extends ordinary least squares (OLS) regression by letting relationships vary over space: it fits local models whose weighted regression coefficients can deviate from the global coefficients.
Geographically weighted regression can give better results than global regression techniques such as OLS when relationships vary over space. An OLS model estimates a single global relationship, while GWR uses neighboring data values to estimate local relationships and can therefore produce more accurate predictions.
In terms of spatial heterogeneity, GWR follows the geographical assumption that the pattern of a spatial phenomenon varies from place to place. The model is therefore not fitted once over the whole data space. Instead, it uses a weighted window around each location, analyzes the values inside it, and estimates the coefficients from the surrounding neighbors. Global regression models such as OLS ignore this assumption and hence provide a less accurate explanation of spatial relationships.
Model and Equation of GWR
OLS models are run to determine the global regression coefficients (β) for the independent variables and to identify the relationship between variables. In this linear regression model, y is the dependent variable, and it is a linear function of the explanatory variables x1, x2, ..., xp.
For n observations, the model is given below:
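In standard notation, the global OLS model and its GWR counterpart can be written as follows. This is the usual textbook form (as in Brunsdon et al., 1996), not a formula taken from the text above. The symbols are assumptions introduced here: (u_i, v_i) are the coordinates of observation i, X is the design matrix with a leading column of ones, and W(u_i, v_i) is the diagonal matrix of geographic weights for regression point i.

```latex
% Global OLS model for observations i = 1, ..., n
y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i

% GWR model: each coefficient is a function of location (u_i, v_i)
y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i

% Local weighted least squares estimate at regression point i
\hat{\boldsymbol{\beta}}(u_i, v_i)
  = \left( X^{\top} W(u_i, v_i)\, X \right)^{-1} X^{\top} W(u_i, v_i)\, \mathbf{y}
```

When every diagonal entry of W(u_i, v_i) equals 1, the local estimate reduces to the global OLS estimate.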
Applications of GWR:
- To find the relationship between educational attainment and income distribution in a specific area.
- To find the key variables that explain high forest fire frequency.
- In which districts are students achieving high test scores? What characteristics seem to be associated with them? Where is each characteristic most important?
- To analyze the relationship between malaria occurrence and environmental factors in the study area.
- Are the factors that influence cancer rates consistent across the study area?
Several software packages can run GWR, for example ArcGIS, R, and GWR4.0.
GWR With ArcGIS Pro
The GWR geoprocessing tool is available in ArcGIS Enterprise 10.8.1 or later.
To run the GWR tool in ArcGIS, provide the input features (data) with a field for the response (dependent) variable and one or more fields for the explanatory (independent) variables. These fields must be numeric and should have a range of values (a distinct minimum and maximum). Clean the dataset before running the tool, for example by removing records with missing values in the dependent or independent variables.
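The cleaning step can be sketched outside ArcGIS in plain Python. This is a minimal illustration, not the tool's own validation: the helpers `clean_records` and `has_range`, the field names, and the sample records are all hypothetical.

```python
import math

def clean_records(records, dependent, explanatory):
    """Keep only records whose dependent and explanatory fields are finite numbers."""
    fields = [dependent, *explanatory]
    cleaned = []
    for rec in records:
        values = [rec.get(f) for f in fields]
        ok = all(
            isinstance(v, (int, float))
            and not isinstance(v, bool)
            and math.isfinite(v)
            for v in values
        )
        if ok:
            cleaned.append(rec)
    return cleaned

def has_range(records, field):
    """True if the field takes more than one distinct value (min < max)."""
    values = [rec[field] for rec in records]
    return len(values) > 0 and min(values) < max(values)

# Hypothetical sample data: income is the dependent variable.
records = [
    {"income": 42.0, "education": 12, "age": 35},
    {"income": None, "education": 16, "age": 41},             # missing dependent value
    {"income": 55.5, "education": float("nan"), "age": 29},   # missing explanatory value
    {"income": 61.0, "education": 18, "age": 50},
]

cleaned = clean_records(records, "income", ["education", "age"])
print(len(cleaned), "of", len(records), "records kept")
```

Checking `has_range` on each field after cleaning catches constant fields, which carry no information for a regression.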
After that, choose the model type based on your dataset. Selecting an appropriate model for your data is essential.
There are three types of GWR (regression) models.
1. Continuous (Gaussian): In this model, the dependent variable takes on a wide range of values, e.g., temperature or total sales.
2. Logistic (Binary): In this model, the dependent variable takes only two possible values, such as success or failure.
3. Poisson (Count): In this model, the dependent variable is discrete and represents the number of occurrences of an event.
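As a rough illustration of how the dependent variable drives the choice among the three model types, the following sketch applies a simple heuristic. The function `suggest_model_type` is hypothetical and is not how ArcGIS selects a model; the final choice should still rest on what the variable actually measures.

```python
def suggest_model_type(values):
    """Suggest a GWR model type from the values of the dependent variable.

    Heuristic only (an assumption for illustration):
    - exactly the two values 0 and 1  -> "logistic" (binary outcome)
    - non-negative whole numbers      -> "poisson"  (event counts)
    - anything else                   -> "gaussian" (continuous)
    """
    values = list(values)
    if not values:
        raise ValueError("the dependent variable has no values")
    distinct = set(values)
    if len(distinct) == 2 and distinct <= {0, 1}:
        return "logistic"
    if all(float(v).is_integer() and v >= 0 for v in values):
        return "poisson"
    return "gaussian"

print(suggest_model_type([0, 1, 1, 0]))        # binary outcome
print(suggest_model_type([3, 0, 7, 2]))        # event counts
print(suggest_model_type([21.4, 19.8, 25.1]))  # continuous measurements
```

A whole-number field is not always a count (a year or an ID code, for example), so the suggestion is only a starting point.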
After selecting the model, choose the neighborhood (also known as the bandwidth), that is, the distance band or the number of neighbors used for each local regression equation. It is perhaps the most important parameter in Geographically Weighted Regression, as it controls the degree of smoothing in the model.
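The two ways of defining a neighborhood can be sketched as follows. This is a minimal illustration using planar Euclidean distance; the function names and the sample coordinates are assumptions, and ArcGIS may compute distances differently depending on the coordinate system.

```python
import math

def within_distance_band(points, target, band):
    """Indices of the points that lie within `band` distance units of `target`."""
    return [i for i, p in enumerate(points) if math.dist(p, target) <= band]

def nearest_neighbors(points, target, k):
    """Indices of the `k` points closest to `target`, nearest first."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], target))
    return order[:k]

# Hypothetical feature locations: a tight cluster and two distant features.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
target = (0.0, 0.0)

band_idx = within_distance_band(points, target, band=2.5)
knn_idx = nearest_neighbors(points, target, k=2)
print("distance band:", band_idx)
print("nearest neighbors:", knn_idx)
```

A distance band keeps the spatial scale constant but lets the number of features per local model vary, while a fixed number of neighbors does the reverse. A larger neighborhood in either form gives smoother, more global-looking coefficients.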
GWR applies geographical weighting to the features used in each local regression equation. Features that are farther from the regression point are given less weight and thus have less influence on the regression results for the target feature; closer features have more weight in the regression equation.
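The weighting idea can be sketched with a Gaussian distance-decay kernel and a weighted least squares fit at a single regression point. This is a simplified illustration with one explanatory variable; the kernel choice, the function names, and the synthetic data are assumptions and do not reproduce the ArcGIS implementation.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight 1 at the regression point, decaying with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares fit of y = a + b*x centred on `target`.

    Returns (intercept, slope) of the local regression at `target`.
    """
    w = [gaussian_weight(math.dist(c, target), bandwidth) for c in coords]
    total = sum(w)
    # Weighted means of x and y.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variance give the local slope.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Synthetic data (assumed): ten features on a west-east line whose true
# slope grows from 1.0 at the western end to 2.8 at the eastern end.
coords = [(float(u), 0.0) for u in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [(1.0 + 0.2 * u) * xi for (u, _), xi in zip(coords, x)]

west = local_fit(coords, x, y, target=(0.0, 0.0), bandwidth=2.0)
east = local_fit(coords, x, y, target=(9.0, 0.0), bandwidth=2.0)
print("west slope:", round(west[1], 2), "east slope:", round(east[1], 2))
```

Because nearby features dominate each fit, the local slope estimated at the eastern end is larger than the one at the western end, which a single global regression would average away.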
Now you can predict values for features (points or polygons) in your study area. Prediction requires that each of the Prediction Locations has a value for every Explanatory Variable. If the field names in the Input Features and the Prediction Locations do not match, a variable matching parameter is used to pair them.
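The requirement on prediction locations can be illustrated with a small check that pairs differently named fields and verifies that every explanatory value is present. The function `match_prediction_fields`, the `field_map` argument, and the field names are hypothetical stand-ins for the tool's variable matching, not its actual interface.

```python
def match_prediction_fields(prediction_rows, explanatory, field_map=None):
    """Rename prediction fields to the training names and check completeness.

    `field_map` maps a training field name to the differently named field
    in the prediction locations; unmapped names are assumed to be identical.
    """
    field_map = field_map or {}
    matched = []
    for row in prediction_rows:
        renamed = {}
        for name in explanatory:
            source = field_map.get(name, name)
            if source not in row or row[source] is None:
                raise ValueError(
                    f"prediction location is missing a value for {name!r}"
                )
            renamed[name] = row[source]
        matched.append(renamed)
    return matched

# Hypothetical prediction locations that use a different name for education.
prediction_rows = [
    {"EDU_YRS": 12, "age": 35},
    {"EDU_YRS": 16, "age": 41},
]
matched = match_prediction_fields(
    prediction_rows,
    explanatory=["education", "age"],
    field_map={"education": "EDU_YRS"},
)
print(matched)
```

Failing early on a missing value mirrors the rule above: a location without a full set of explanatory values cannot be given a prediction.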
The GWR tool in ArcGIS produces several outputs. The statistical and GWR summaries are available as messages at the bottom of the Geoprocessing pane.
References
Arbia, Giuseppe (2014). A Primer for Spatial Econometrics: With Applications in R. Springer.
Brunsdon, Chris, A. Stewart Fotheringham, and Martin E. Charlton (1996). "Geographically weighted regression: a method for exploring spatial nonstationarity". Geographical Analysis 28.4, pp. 281–298.
Brunsdon, Chris, Stewart Fotheringham, and Martin Charlton (1998). "Geographically weighted regression". Journal of the Royal Statistical Society: Series D (The Statistician) 47.3, pp. 431–443.