Python 100 project #19: London Crime Map

The United Kingdom is said to be one of the most open countries when it comes to data. A lot of data is indeed available online, but there are a few problems.

One of them is readability. The datasets are available, but they are often difficult to understand. The difficulty comes from their terminology, their structure, and the UK’s complicated (and historical) social systems.

This is a good opportunity to look through these figures and visualize them with Python. This time, I use street crime report data and make a heatmap of the crimes that happened in December 2017.


Output Example:


Here is the code:

import glob
import os

import folium
from folium import plugins
import pandas as pd

YEAR = '2017'
MONTH = '12'

# get source file names. data files are separated into respective month.
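# each file name is assumed to start with "<year>-<month>-".
met_path = "your-local-directory/"
source_csvs = glob.glob(os.path.join(met_path, "*.csv"))

# read each monthly file and tag its rows with the year and month from the file name
temp_dflist = []
for csv in source_csvs:
    year, month = os.path.basename(csv).split('-')[:2]
    df = pd.read_csv(csv, index_col=None, header=0)
    df['Year'] = year
    df['Month'] = month
    temp_dflist.append(df)

streetcrime_df = pd.concat(temp_dflist)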

# cleanse the data, dropping any rows without location info
streetcrime_df.dropna(subset=['Latitude', 'Longitude'], inplace=True)

# make another dataframe which includes only the chosen crime type in the specified year-month.
periodic_df = streetcrime_df[
    (streetcrime_df['Year'] == YEAR)
    & (streetcrime_df['Month'] == MONTH)
    & (streetcrime_df['Crime type'] == 'Violence and sexual offences')
]
crime_locations = list(zip(periodic_df.Latitude, periodic_df.Longitude))

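# generate map, centered on central London (longitude is negative: west of Greenwich)
base_map = folium.Map(location=[51.5074, -0.1277], zoom_start=10)
heatmap = plugins.HeatMap(crime_locations, radius=5, blur=2)
heatmap.add_to(base_map)
# save the map as an HTML file (this file name is just an example)
base_map.save('london_crime_heatmap.html')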

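The loading loop relies on every file name starting with the year and month. If the folder might contain other CSV files, a small check can help. The following is only a sketch: the helper name parse_year_month, the pattern, and the example file names are my assumptions, not part of the dataset's documentation.

```python
import os
import re

# Assumed file name pattern: "<4-digit year>-<2-digit month>-<anything>.csv"
NAME_PATTERN = re.compile(r'^(\d{4})-(\d{2})-.*\.csv$')


def parse_year_month(path):
    """Return (year, month) as strings, or None if the name does not match."""
    match = NAME_PATTERN.match(os.path.basename(path))
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical file names, used only to show the behaviour
print(parse_year_month("your-local-directory/2017-12-example-street.csv"))
print(parse_year_month("your-local-directory/notes.csv"))
```

Files for which the helper returns None could simply be skipped in the loop instead of being read.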

The result is not very self-explanatory, as the crimes are almost everywhere. I’m not sure whether this is unique to London or similar in other metropolitan cities, but it is clear that crimes happen almost everywhere in London. Sadly, no time data is available in this dataset.
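
When the heatmap covers almost the whole city, counting points per grid cell is one possible way to see where the reports are most concentrated. The sketch below is only an illustration: the function count_by_cell, the 0.01 degree cell size, and the sample coordinates are my assumptions, and in practice crime_locations from the code above would be passed in instead of the sample list.

```python
from collections import Counter
import math


def count_by_cell(locations, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of cell_deg degrees on each side."""
    counts = Counter()
    for lat, lon in locations:
        # Integer cell index; floor keeps negative longitudes in the right cell
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up sample points, not real crime data
sample = [(51.5074, -0.1277), (51.5079, -0.1271), (51.5301, -0.0702)]
counts = count_by_cell(sample)
top_cell, top_count = counts.most_common(1)[0]
print(top_cell, top_count)
```

Listing the few cells with the highest counts gives a ranking that the saturated map alone does not show.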