**Figure 2**: Frequency of categories of street crime reported in 2012 and 2014.

In order to assess the spatial relationship between the distribution of cellular radios and the change in street crime between 2012 and 2014, a 660 by 1060 km grid is defined over the UK with 5 x 5 km grid-cells. The coordinate reference system (CRS) of the data is transformed from EPSG: 4326 (WGS 84) to EPSG: 27700 (British National Grid) to reduce spatial distortions during the binning process. After all the data points are binned within the grid, the total number of radios and crimes are counted in each grid-cell. A subtraction is then performed to obtain the difference in street crimes per grid-cell between 2012 and 2014. The result of the process is a set of spatially defined grid-cells, each containing the total number of radios, and the change in crimes between 2012 and 2014.
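The binning and subtraction steps described above can be sketched in a few lines. This is a minimal illustration, not the study's actual code: it assumes the points have already been reprojected to British National Grid eastings and northings in metres, and it assumes the grid origin sits at (0, 0), which the text does not state. The function names are hypothetical.

```python
from collections import Counter

CELL_SIZE_M = 5_000          # 5 x 5 km grid-cells
GRID_WIDTH_M = 660_000       # 660 km east-west extent
GRID_HEIGHT_M = 1_060_000    # 1060 km north-south extent
N_COLS = GRID_WIDTH_M // CELL_SIZE_M     # 132 columns
N_ROWS = GRID_HEIGHT_M // CELL_SIZE_M    # 212 rows


def cell_index(easting, northing, origin=(0.0, 0.0)):
    """Return the (column, row) of the cell containing a point, or None if it lies outside the grid.

    The origin is an assumption; the article does not give the grid's corner.
    """
    col = int((easting - origin[0]) // CELL_SIZE_M)
    row = int((northing - origin[1]) // CELL_SIZE_M)
    if 0 <= col < N_COLS and 0 <= row < N_ROWS:
        return col, row
    return None


def count_per_cell(points):
    """Count how many (easting, northing) points fall in each grid-cell."""
    cells = (cell_index(e, n) for e, n in points)
    return Counter(c for c in cells if c is not None)


def change_per_cell(points_2012, points_2014):
    """Per-cell count in 2014 minus per-cell count in 2012."""
    before = count_per_cell(points_2012)
    after = count_per_cell(points_2014)
    # A Counter returns 0 for cells that are missing in one of the years.
    return {cell: after[cell] - before[cell] for cell in set(before) | set(after)}
```

Calling `count_per_cell` on the radio locations and `change_per_cell` on the two years of crime locations would give the two per-cell quantities used in the rest of the analysis.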

The number of radios and the total change in crimes per grid-cell are plotted on maps of the UK (Figure 3). The data in the figure are scaled to improve the interpretability and the appearance of the graphs. As expected, cellular radios tend to be concentrated around cities and areas with large populations. The change in street crime across the UK is very subtle, so it is difficult to make any conclusive assessment about the relationship between street crime and cellular coverage from a visual inspection of the maps alone.

**Figure 3**: The map on the left shows the density of radios; red colors indicate denser cellular coverage. The map on the right shows the change in street crime; red indicates an increase in street crime, while blue indicates a decrease. The x- and y-axes on both maps correspond to coordinates of the British National Grid.

Scatter plots are used to express the relationship between the change in Comparative Crimes and the total number of radios per grid-cell (Figure 4). A negative trend is observed, along with a few outliers. The abnormality of each data point in the dataset is quantitatively measured with an Isolation Forest algorithm; a non-parametric technique is chosen in order to avoid making assumptions about the statistical distribution of the data. In an Isolation Forest, each data point is scored according to how close to the root of a tree it becomes isolated: the most abnormal data points are isolated close to the root, while more normal data points are found deeper in the tree. Each data point can therefore be ranked by its isolation score, with lower scores corresponding to more abnormal data (Figure 5).
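The idea that abnormal points are isolated near the root can be illustrated with a stripped-down version of the algorithm. The sketch below is a simplification written for illustration, not the implementation used in the study: it reports the mean depth at which a point is isolated by random axis-aligned splits, and it omits the path-length correction and score normalization of the full algorithm. The function names, parameter defaults, and toy data are assumptions.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=8):
    """Depth at which `point` is separated from `data` by random axis-aligned splits."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:
        return depth  # no split is possible along this dimension
    split = rng.uniform(lo, hi)
    # Follow only the branch of the tree that contains the point.
    if point[dim] < split:
        side = [row for row in data if row[dim] < split]
    else:
        side = [row for row in data if row[dim] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees=100, sample_size=64, seed=0):
    """Average isolation depth over many random trees; shorter means more abnormal."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += path_length(point, sample, rng)
    return total / n_trees


# Toy data: a tight cluster plus one far-away point (values are illustrative only).
toy_rng = random.Random(1)
cluster = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(200)]
outlier = (50.0, -50.0)
points = cluster + [outlier]
print(mean_path_length(outlier, points), mean_path_length((0.0, 0.0), points))
```

In this toy setting the far-away point is isolated after far fewer splits than a point in the middle of the cluster, which is the behavior the isolation score summarizes.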

**Figure 4**: Scatter plots are used to illustrate the change in Comparative Crime vs the number of radios per grid-cell. In the scatter plot on the left, the two most abnormal points can be identified (highlighted in red-orange). These two points represent grid-cells located near the center of London. The scatter plot on the right is the same as the one on the left, zoomed-in and excluding the two points.

**Figure 5**: An Isolation Forest is used to quantitatively rank each data point in terms of abnormality. The display on the left is a contour plot of the isolation scores assigned to each data point; the axes are the same as those used in the leftmost scatter plot in Figure 4. The histogram on the right shows the frequency of occurrence of isolation scores assigned in the dataset. Lower isolation scores correspond to more abnormal data.

The two most abnormal points are colored reddish-orange in Figure 4 for visual identification; these two points could be considered high-leverage points due to their substantial distance from the majority of other points along the trend. Both of the points display a strong negative change in Comparative Crimes, and a radio count far above average. These two points represent grid-cells located near the center of London, within 5 km of Big Ben. One point corresponds to a grid-cell near Park Road, next to The Boating Lake in The Regent’s Park, and the other corresponds to a grid-cell close to the intersection of Bath Street and Old Street in London. It could be argued that these two points are not erroneous, and that they contain relevant and interesting information ― both occur near a major city center, and both occur along the negative trend defined by the rest of the data. Nonetheless, the two points are removed from the dataset to avoid distortions while modeling.

Again, it is difficult to make definitive statements about the relationship between street crime and cellular coverage based on a visual inspection of the scatter plots. Therefore, three statistical tests of hypothesis are performed using bootstrap techniques to determine if there is a statistically significant difference in the mean change in street crimes between grid-cells that have an above average, and a below average number of radios. In other words: Is the number of radios in a given grid-cell a statistically significant factor in determining the change in crime observed from 2012 to 2014?

In order to perform the hypothesis test, the grid-cells are split into two groups: Strong Cellular Coverage and Weak Cellular Coverage. The two groups are defined using the mean number of radios per grid-cell.

**Strong Cellular Coverage (SCC)**: Grid-cells with a large number of radios are defined as those with a radio count above the mean. These grid-cells are assumed to have strong cellular coverage for the test.

**Weak Cellular Coverage (WCC)**: Grid-cells with a small number of radios are defined as those with a radio count equal to or below the mean. These grid-cells are assumed to have weak cellular coverage for the test.

In the first test (i.e., Test 1), the mean change in Comparative Crimes is noted in both SCC and WCC groups; the difference between these two means is then obtained. The null hypothesis is that there is no significant difference in the mean change in Comparative Crimes between the SCC and WCC groups, and the alternative hypothesis is that there is a significant difference. The chosen alpha level for the test is 0.05.
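The text does not spell out the resampling scheme, so the following is only a sketch of one common way to run such a bootstrap test. Both groups are shifted to share the pooled mean (simulating the null hypothesis), each is resampled with replacement, and the p-value is the fraction of resampled differences at least as extreme as the observed one. The function name, the number of resamples, and the toy values are assumptions.

```python
import random
from statistics import fmean


def bootstrap_mean_diff_test(scc, wcc, n_boot=2000, seed=0):
    """Two-sided bootstrap test for a difference in group means (hypothetical helper).

    Returns (observed difference, p-value).
    """
    rng = random.Random(seed)
    scc_mean, wcc_mean = fmean(scc), fmean(wcc)
    observed = scc_mean - wcc_mean
    pooled_mean = fmean(list(scc) + list(wcc))
    # Shift each group so both share the pooled mean; this simulates the null hypothesis.
    scc_null = [x - scc_mean + pooled_mean for x in scc]
    wcc_null = [x - wcc_mean + pooled_mean for x in wcc]
    extreme = 0
    for _ in range(n_boot):
        a = rng.choices(scc_null, k=len(scc_null))  # resample with replacement
        b = rng.choices(wcc_null, k=len(wcc_null))
        if abs(fmean(a) - fmean(b)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_boot


# Toy change-in-crime values per grid-cell (illustrative only, not the study's data).
scc_change = [-10 + (i % 5) for i in range(40)]
wcc_change = [0 + (i % 5) for i in range(40)]
diff, p_value = bootstrap_mean_diff_test(scc_change, wcc_change)
print(diff, p_value, p_value < 0.05)
```

With an alpha level of 0.05, the null hypothesis would be rejected whenever the returned p-value falls below 0.05.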

The second and third tests are performed in the same way as Test 1. Whereas Test 1 assesses the difference in the mean change in Comparative Crimes between the SCC and WCC groups, Test 2 assesses the difference in the mean change of *all categories* of street crime between the two groups. Finally, Test 3 narrows the analysis down to just one category of street crime: the difference in the mean change in Anti-Social Behaviour between the two groups.

P-values of 0.00, 0.33, and 0.00 are obtained in Test 1, Test 2, and Test 3, respectively (Figures 6 to 8). The results of Test 1 suggest that there is a statistically significant difference in the mean change in Comparative Crimes between grid-cells that have an above average, and a below average number of radios as defined in the SCC and WCC groups. However, the results of Test 2 suggest that there is no statistically significant difference in the mean change in total street crimes between the two groups.

**Figure 6**: Test 1, a hypothesis test is performed to determine whether there is a statistically significant difference in the mean change in Comparative Crimes between grid-cells that have an above average, and a below average number of radios.

**Figure 7**: Test 2, a hypothesis test is performed to determine whether there is a statistically significant difference in the mean change in total street crimes between grid-cells that have an above average, and a below average number of radios.

**Figure 8**: Test 3, a hypothesis test is performed to determine whether there is a statistically significant difference in the mean change in Anti-Social Behaviour crimes between grid-cells that have an above average, and a below average number of radios.

Five possible reasons for a difference in results between Test 1 and Test 2 are: (1) The results of Test 1 were influenced by crime category selection bias, since only the Comparative Crimes, rather than all crime categories, were included in Test 1. (2) The crime categorization scheme used by the UK Police changed between 2012 and 2014, which may have changed the way crimes are reported between the two years, thus introducing bias. (3) There were 14 crime categories reported in 2014, and only 11 categories reported in 2012. Perhaps a more detailed categorization scheme in 2014 led to a larger number of reported crimes than would have otherwise been recorded using 2012’s categorization scheme. (4) Removal of the two high-leverage points mentioned in the previous step influenced the results. In order to determine the effect that the removal of these two points had on Test 2, the test was rerun with both points included. The rerun resulted in a lower p-value of 0.16; however, the p-value was not lower than the chosen alpha level of 0.05, and therefore did not change the overall conclusion. And lastly, (5) changes in certain categories of street crime are more strongly correlated with the number of radios than others.

The results of Test 3 suggest that there is a statistically significant difference in the mean change in Anti-Social Behaviour crimes between grid-cells that have an above average, and a below average number of radios. This suggests that the results depend on which categories of crime are tested: certain street crimes may be more or less correlated with cellular coverage than others.

**Statistical Modeling**

Five machine learning algorithms are optimized, trained, and tested to perform binary classification: a Logistic Regression classifier (LRC), a Support Vector Machines classifier (SVC), a K-Neighbors classifier (KNN), a Gradient Boosting classifier (GBC), and a Gaussian Naive Bayes classifier (NBC). The purpose of the classification is to predict whether or not a given grid-cell has more (or less) than the mean number of radios, based on the observed change in street crime.

In order to fit the models for prediction, a classification scheme is defined in the dataset (Figure 9). The classification scheme is based on the number of LTE radios per grid-cell, since the difference in crimes we are testing on occurred during the roll-out of LTE in the UK (i.e., 2012–2014). Grid-cells are classified into the following two groups based on the mean number of LTE radios:

**Category 0**: Grid-cells that contain the mean or more radios are classified as 0.

**Category 1**: Grid-cells that contain less than the mean are classified as 1; these grid-cells might benefit from more radios.

**Figure 9**: A binary classification scheme used to label grid-cells according to the number of radios they contain. Category 0 grid-cells contain the mean number or more radios. Category 1 grid-cells contain less than the mean number of radios. Category 1 is the target category.

Features for prediction are selected from among the seven categories of Comparative Crimes. ANOVA F-values are computed to assess the significance of each feature according to the classification scheme (Figure 10). The five crime categories with the strongest scores are: Anti-Social Behaviour, Criminal Damage and Arson, Other Theft, Burglary, and Vehicle Crime. These five categories are chosen as input features for prediction, and are referred to as “Predictor Crimes” in the current study.

**Figure 10**: F-values are used for feature ranking to determine which categories of street crimes are most discriminating (i.e., best predictor candidates) according to the classification scheme. The x-axis labels correspond to the seven Comparative Crimes: Anti-Social Behaviour (AS), Criminal Damage and Arson (CDA), Other Theft (OT), Burglary (B), Vehicle Crime (VC), Drugs (D), and Shoplifting (S).
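For a single feature, the one-way ANOVA F-value is the ratio of the between-group mean square to the within-group mean square, so a larger value means the feature separates the categories more clearly. The short function below is an illustrative sketch of that calculation for one crime category split by Category 0 and Category 1; the function name and the toy numbers are assumptions, not values from the study.

```python
from statistics import fmean


def anova_f_value(groups):
    """One-way ANOVA F-value for one feature, given its values split by class."""
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy change-in-crime values for one category (illustrative only).
category_0 = [1, 2, 3]
category_1 = [4, 5, 6]
print(anova_f_value([category_0, category_1]))
```

Repeating the calculation for each of the seven Comparative Crimes and sorting by F-value would produce a ranking of the kind shown in Figure 10.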

The five machine learning algorithms are optimized using a grid search cross validation technique across a variety of parameters. Before optimization, the dataset is split into train and test datasets ― 80% for training and 20% for testing. The test dataset is set aside for model evaluation. The models are fit to the train dataset using the best parameters identified in the grid search cross validation. Precision, recall, and F1 score metrics are calculated for each model using the test dataset. Receiver Operating Characteristic (ROC) and Precision-Recall curves are graphed to evaluate model performance (Figures 11 to 15). According to the metrics, NBC performed the best in terms of overall predictive performance, slightly better than GBC and SVC.
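The workflow above (hold out 20% of the data, search a parameter grid with cross validation on the remaining 80%, then evaluate on the held-out set) can be sketched with a hand-written stand-in. To stay self-contained, the example swaps in a tiny K-nearest-neighbors classifier and plain accuracy rather than the five models and the metrics used in the study. All function names, the parameter grid, and the synthetic data are assumptions.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def accuracy(train_X, train_y, eval_X, eval_y, k):
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(eval_X, eval_y))
    return hits / len(eval_y)


def cross_val_accuracy(X, y, k, n_folds=5):
    """Mean accuracy over n_folds, holding out every n_folds-th sample in turn."""
    scores = []
    for fold in range(n_folds):
        held_out = list(range(fold, len(X), n_folds))
        held_set = set(held_out)
        fit_idx = [i for i in range(len(X)) if i not in held_set]
        scores.append(accuracy([X[i] for i in fit_idx], [y[i] for i in fit_idx],
                               [X[i] for i in held_out], [y[i] for i in held_out], k))
    return sum(scores) / n_folds


def grid_search(X, y, k_values, n_folds=5):
    """Return the k with the best cross-validated accuracy (first one wins ties)."""
    return max(k_values, key=lambda k: cross_val_accuracy(X, y, k, n_folds))


def split_train_test(X, y, test_fraction=0.2, seed=0):
    """Shuffle once and hold out a fraction of the samples for final evaluation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(X) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


# Synthetic two-class data (illustrative only, not the study's features).
rng = random.Random(0)
X = ([(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(50)]
     + [(rng.gauss(-5, 1), rng.gauss(-5, 1)) for _ in range(50)])
y = [0] * 50 + [1] * 50
train_X, train_y, test_X, test_y = split_train_test(X, y)
best_k = grid_search(train_X, train_y, k_values=[1, 3, 5])
print(best_k, accuracy(train_X, train_y, test_X, test_y, best_k))
```

The key design point is that the test split is never seen during the grid search, so the final score is not biased by the parameter selection.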

In determining where to install cellular infrastructure, it’s important to avoid mislabeling Category 0 grid-cells as Category 1, because false positives are more costly than false negatives when identifying target locations to install cellular radios. Mislabeling a Category 0 grid-cell as Category 1 might result in misallocation of resources; therefore, the model must have high precision (i.e., positive predictive value) in identifying Category 1 grid-cells. In this regard, NBC and SVC performed equally well: both achieve a Category 1 precision of 92%. In terms of recall (i.e., true positive rate), NBC correctly identified 97% of the actual Category 1 grid-cells, while SVC identified 96%. Category 0 was harder to identify: NBC correctly identified 72% of the actual Category 0 grid-cells, while SVC identified 71%.
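Precision, recall, and F1 for a given category follow directly from counts of true positives, false positives, and false negatives. The sketch below shows that arithmetic for either category; the function name and the toy labels are assumptions chosen for illustration and are unrelated to the study's results.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 score, treating `positive` as the target category."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)  # costly: wrongly flagged cells
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed target cells
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels for six grid-cells (illustrative only).
actual = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 1, 0]
print(precision_recall_f1(actual, predicted, positive=1))
print(precision_recall_f1(actual, predicted, positive=0))
```

Because a false positive here means recommending radios for a grid-cell that already has at least the mean number, Category 1 precision is the figure to watch when comparing models.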

**Figure 11**: Receiver Operating Characteristic (ROC) and Precision-Recall curves for the Logistic Regression classifier (LRC).

**Figure 12**: Receiver Operating Characteristic (ROC) and Precision-Recall curves for the Support Vector Machines classifier (SVC).

**Figure 13**: Receiver Operating Characteristic (ROC) and Precision-Recall curves for the K-Neighbors classifier (KNN).

**Figure 14**: Receiver Operating Characteristic (ROC) and Precision-Recall curves for the Gradient Boosting classifier (GBC).

**Figure 15**: Receiver Operating Characteristic (ROC) and Precision-Recall curves for the Gaussian Naive Bayes classifier (NBC).

Principal Component Analysis (PCA) is used to reduce the dimensionality of the five Predictor Crimes down to two, in order to graphically illustrate the classification decision boundaries defined by each of the statistical models in two dimensions (Figures 16 to 20). Red areas in the figures correspond to Category 1 predictions, while blue areas correspond to Category 0. The red and blue colored points correspond to the actual labels of the Principal Components (PCs). According to both the graphical and numerical metrics, the Naive Bayes approach is the recommended method for predicting Category 1 and Category 0 grid-cells, based on the observed change in Predictor Crimes.

**Figure 16**: Classification of the principal components according to the Logistic Regression classifier (LRC).

**Figure 17**: Classification of the principal components according to the Support Vector Machines classifier (SVC).

**Figure 18**: Classification of the principal components according to the K-Neighbors classifier (KNN).

**Figure 19**: Classification of the principal components according to the Gradient Boosting classifier (GBC).

**Figure 20**: Classification of the principal components according to the Gaussian Naive Bayes classifier (NBC).

**Conclusions**

Identifying locations to install cellular radios in public spaces based solely on the observed change in street crime does not tell the complete story without additional information. For example, the cost of rent and equipment, land usage regulations, and accessibility are just a few additional variables to consider. In addition, the frequency and distribution of crime through space and time is complex. Given the data and information used in the current study, no causal connection between cellular coverage and crime can be made. The current study has shown that change in certain categories of street crime may be more or less correlated with cellular coverage than others. For example, the hypothesis test performed on the difference in the mean change in Anti-Social Behaviour between the Strong Cellular Coverage and Weak Cellular Coverage groups indicates a statistically significant difference between the groups. In addition, the observed change in Predictor Crimes (identified in the feature selection stage) can be used to predict whether a given grid-cell contains more or fewer radios than the mean. This sort of information might be useful for mobile carriers, governments, and communities involved in the process of deciding where to install cellular infrastructure.
