One of the first steps in spatial analysis is to create a neighborhood matrix, that is, to define a relationship/connection between each and (ideally!) every polygon. Why? Well, given that the premise of spatial analysis is that neighboring locations are more similar than far away locations, we need to define what “near” means: a set of neighbors for each location that captures such dependence.

There are many ways to define neighbors, and usually, they are not interchangeable, meaning that one neighborhood definition will capture spatial autocorrelation differently from another.

In R, the package **spdep** lets you create a neighbor matrix according to a wide range of definitions: contiguity, radial distance, graph based, triangulation (and more). The 3 main and most used types of neighbors are: **1) Contiguity based of order 1 or higher, 2) Distance based, and 3) Graph based.**

Install and load the **maptools** and **spdep** libraries, then read the shapefile of the North Carolina counties:

`> library(maptools)`

`> library(spdep)`

`> NC <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1], IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))`
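The **maptools** package has since been retired from CRAN, so the call above may not run on a current R installation. A minimal sketch of an alternative, assuming the **sf** package is installed and using the North Carolina shapefile that ships with it (an assumption: this is the `nc.shp` file bundled with **sf**, not the `sids.shp` file used above, although it also carries a `FIPSNO` column). The object names `NC_sf` and `nb_queen_sf` are made up for this illustration:

```r
library(sf)     # assumed to be installed; not used in the original post
library(spdep)

# Read the North Carolina counties bundled with sf
NC_sf <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# poly2nb() also accepts sf polygons; region ids are passed as character
nb_queen_sf <- poly2nb(NC_sf, queen = TRUE,
                       row.names = as.character(NC_sf$FIPSNO))
nb_queen_sf
```

Note that the later calls to `coordinates()` expect the **sp** object `NC`, so this sketch only covers reading the data and building a contiguity neighbor list.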

1. Contiguity based relations are the most used in the presence of **irregular polygons** with varying shape and surface, since contiguity **ignores distance** and focuses instead on the location of an area. The function **poly2nb** creates 2 types of contiguity based relations:

- First Order Queen Contiguity defines a neighbor when at least one point on the boundary of one polygon is shared with at least one point of its neighbor (common border or corner);

`> nb.FOQ <- poly2nb(NC, queen=TRUE, row.names=NC$FIPSNO) #row.names refers to the unique names of each polygon`

Calling `nb.FOQ` gives you a summary of the neighbor matrix, including the total number of areas/counties and the average number of links:

Neighbour list object:

Number of regions: 100

Number of nonzero links: 490

Percentage nonzero weights: 4.9

Average number of links: 4.9

- First Order Rook Contiguity does not include corners, only borders, thus comprising only polygons that share more than one boundary point;

`> nb.RK <- poly2nb(NC, queen=FALSE, row.names=NC$FIPSNO)`

`> nb.RK`

Neighbour list object:

Number of regions: 100

Number of nonzero links: 462

Percentage nonzero weights: 4.62

Average number of links: 4.62

NB: if there are regions without any link, the summary also lists them, as in this output from a different dataset with 910 regions:

Neighbour list object:

Number of regions: 910

Number of nonzero links: 4906

Percentage nonzero weights: 0.5924405

Average number of links: 5.391209

10 regions with no links:

1014 3507 3801 8245 9018 10037 22125 30005 390299 390399

Here you can identify the regions with no links (1014, 3507,…). In R it is possible to connect them manually, or to switch to a neighbor definition that includes them (such as graph based neighbors).
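Regions without links can also be found programmatically. A minimal sketch, assuming a small set of made-up coordinates rather than the dataset above (the names `coords`, `nb_dist`, `no_links` and `nb_knn` are hypothetical): a distance band leaves one isolated point without neighbors, `card()` locates it, and k-nearest neighbors are shown as one possible way to give every point a link.

```r
library(spdep)

# Hypothetical coordinates: four clustered points and one isolated point
coords <- cbind(x = c(0, 1, 0, 1.2, 10), y = c(0, 0, 1.1, 1.2, 10))

# Distance band of 0 to 2 units: the isolated point gets no neighbors
nb_dist <- dnearneigh(coords, d1 = 0, d2 = 2)

# card() returns the number of neighbors of each point
no_links <- which(card(nb_dist) == 0)
no_links

# One possible fix: with k nearest neighbors every point has k links
nb_knn <- knn2nb(knearneigh(coords, k = 1))
card(nb_knn)
```

Recent versions of **spdep** may print warnings about observations without neighbors or about disconnected subgraphs; these are expected here.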

- Higher order neighbors are useful when looking at the effect of lags on spatial autocorrelation and in spatial autoregressive models like SAR with a more global spatial autocorrelation:

`> nb.FOQ <- poly2nb(NC, queen=TRUE, row.names=NC$FIPSNO) #first define the first order queen to get to further lags`

`> nb.SOQ <- nblag(nb.FOQ, 2) #Second Order Queen: 2 is the lag, if you want 6th order neighbors you'd have nblag(nb.FOQ, 6)`

`> nb.RK <- poly2nb(NC, queen=FALSE, row.names=NC$FIPSNO) #same here`

`> nb.SRC <- nblag(nb.RK, 2) #Second Order Rook`
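To see what queen, rook and higher order contiguity do without loading any shapefile, here is a minimal sketch on a regular grid. It assumes a hypothetical 3 x 3 grid built with `cell2nb()` from **spdep** in place of real polygons; the object names are made up for the illustration.

```r
library(spdep)

# A hypothetical 3 x 3 grid of square cells stands in for the polygons
nb_rook  <- cell2nb(3, 3, type = "rook")
nb_queen <- cell2nb(3, 3, type = "queen")

# The center cell (cell 5) touches 4 cells by an edge
# and 8 cells by an edge or a corner
card(nb_rook)[5]
card(nb_queen)[5]

# nblag() returns a list: element 1 holds the first order neighbors,
# element 2 the second order neighbors (reached in exactly two steps)
lags_rook <- nblag(nb_rook, 2)
card(lags_rook[[2]])[1]   # second order rook neighbors of a corner cell

# nblag_cumul() merges the lags into one neighbor list (orders 1 and 2)
nb_rook_12 <- nblag_cumul(lags_rook)
card(nb_rook_12)[1]
```

Note that `nblag()` returns a list of neighbor lists, one per order, so the second order neighbors are in the second element, not in the object itself.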

2. Distance based neighbors define a set of connections between polygons based either on (1) a defined Euclidean distance between centroids (`dnearneigh`) or (2) a certain number of nearest neighbors (`knn2nb`, e.g. 5 nearest neighbors);

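`> coordNC <- coordinates(NC) #get centroids coordinates`

`> d05m <- dnearneigh(coordNC, d1=0, d2=0.5) #define the distance band (here from 0 to 0.5; the coordinates are unprojected longitude/latitude, so the units are decimal degrees, not miles)`

`> nb.5NN <- knn2nb(knearneigh(coordNC, k=5), row.names=NC$FIPSNO) #set the number of neighbors (here 5)`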

A little trick: to get information on neighbor distances, whatever the type of neighborhood may be:

`> distance <- unlist(nbdists(nb.5NN, coordNC))`

`> summary(distance)`

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.1197 0.3323 0.3956 0.4095 0.4716 0.9327
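One property of k-nearest neighbors worth checking is that the relation is not necessarily symmetric: **a** can be among the nearest neighbors of **b** without **b** being among those of **a**. A minimal sketch, assuming four made-up points on a line (the names `coords`, `nb_1nn` and `nb_1nn_sym` are hypothetical), that tests for symmetry and, if wanted, forces it:

```r
library(spdep)

# Hypothetical points on a line; spacing chosen so that
# each nearest neighbor is unambiguous
coords <- cbind(x = c(0, 1, 2.5, 6), y = c(0, 0, 0, 0))

nb_1nn <- knn2nb(knearneigh(coords, k = 1))

# Point 3 has point 2 as nearest neighbor, but point 2 has point 1
is.symmetric.nb(nb_1nn, verbose = FALSE)

# make.sym.nb() adds the missing reverse links
nb_1nn_sym <- make.sym.nb(nb_1nn)
is.symmetric.nb(nb_1nn_sym, verbose = FALSE)

# Link lengths, computed with nbdists() as above
summary(unlist(nbdists(nb_1nn_sym, coords)))
```

Whether a symmetric neighbor list is needed depends on the model that will use it, so forcing symmetry is a choice rather than a required step.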

3. Graph based

- Delaunay triangulation: **tri2nb** constructs neighbors through a Delaunay triangulation (the dual of the Voronoi diagram), such that each centroid is a triangle node. As a consequence, DT ensures that every polygon has a neighbor, even in the presence of islands. The “problem” with this specification is that it treats our area of study as if it were an island itself, without any neighbors (as if North Carolina were an island with no Virginia or South Carolina)… Therefore, distant points that would not otherwise be neighbors (such as Cherokee and Brunswick counties) become such;

- Gabriel Graph: **gabrielneigh** is a particular case of the DT, where **a** and **b** are two neighboring points/centroids if no other point/centroid lies inside the circle passing through **a** and **b** with diameter **ab**;

- Sphere of Influence: **soi.graph**: two points **a** and **b** are SOI neighbors if the circles centered on **a** and **b**, with radii equal to the nearest neighbor distances of **a** and **b** respectively, intersect twice. It is a sort of Delaunay triangulation without the longest connections;

- Relative Neighbors: **relativeneigh** is a particular case of GG. An edge **ab** belongs to RN if the intersection of the two circles centered on **a** and **b** with radius **ab** does not contain any other point.

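`> IDs <- row.names(as(NC, "data.frame")) #create a vector with the names of each polygon (here NC$FIPSNO)`

`> delTrinb <- tri2nb(coordNC, row.names = IDs) #Delaunay triangulation`

`> distance <- unlist(nbdists(delTrinb, coordNC)) #recompute the distances for this neighbor list`

`> summary(distance)`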

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.1197 0.3473 0.4154 0.5673 0.5187 5.5830

`> SOInb <- graph2nb(soi.graph(delTrinb, coordNC), row.names = IDs) #Sphere of influence`

`> distance <- unlist(nbdists(SOInb, coordNC))`

`> summary(distance)`

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.1197 0.3257 0.3919 0.3958 0.4629 0.6460

`> GGnb <- graph2nb(gabrielneigh(coordNC), row.names = IDs) #Gabriel graph`

`> distance <- unlist(nbdists(GGnb, coordNC))`

`> summary(distance)`

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.1197 0.3191 0.3715 0.3813 0.4364 0.6777

`> RNnb <- graph2nb(relativeneigh(coordNC), row.names = IDs) #Relative neighbor (or relative graph)`

`> distance <- unlist(nbdists(RNnb, coordNC))`

`> summary(distance)`

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.1197 0.2984 0.3464 0.3382 0.3815 0.5187
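The shrinking maximum distance in the summaries above reflects how these graphs nest: the relative neighbor graph keeps a subset of the Gabriel graph links. A minimal sketch of that comparison, assuming made-up random points instead of the county centroids (the names `coords`, `nb_gabriel` and `nb_relative` are hypothetical, and `sym = TRUE` is used here so that every link is stored in both directions):

```r
library(spdep)

set.seed(1)
# Hypothetical random points standing in for county centroids
coords <- cbind(runif(30), runif(30))

# sym = TRUE stores every link in both directions
nb_gabriel  <- graph2nb(gabrielneigh(coords), sym = TRUE)
nb_relative <- graph2nb(relativeneigh(coords), sym = TRUE)

# Total number of links in each neighbor list: the relative
# neighbor graph should not have more links than the Gabriel graph
sum(card(nb_relative))
sum(card(nb_gabriel))
```

Only the two graphs that need no extra packages are compared here; `tri2nb` and `soi.graph` may rely on additional packages depending on the installed **spdep** version.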