Project 4: Road Finding¶
by Lorn
Hypothesis¶
It is possible to extract roads from satellite imagery with enough clarity to distinguish them from non-road features.
Data¶
Massachusetts Roads. A dataset of RGB images and rasterized mask of the road segments in that image.
https://www.kaggle.com/datasets/balraj98/massachusetts-roads-dataset
Microsoft Building Footprints. Building footprints in vector data format.
https://planetarycomputer.microsoft.com/dataset/ms-buildings
0.5m DEM of Massachusetts from MassGIS.
https://www.mass.gov/orgs/massgis-bureau-of-geographic-information
import rasterio
import numpy as np
import pandas as pd
import warnings
from pathlib import Path
import geopandas as gpd
from shapely.geometry import box
import matplotlib.pyplot as plt
import folium
import ipywidgets as widgets
from IPython.display import display
import joblib
import ipywidgets as widgets
from IPython.display import display
from scipy.ndimage import sobel, gaussian_filter
from sklearn.ensemble import RandomForestClassifier
from skimage.exposure import rescale_intensity
from skimage.filters import frangi
from skimage.morphology import closing, dilation, rectangle as footprint_rectangle
import warnings
warnings.filterwarnings("ignore")
View #1¶
The first view is a display of the raw data. We'll be working with the MassRoads dataset primarily and joining in additional data when needed. The images in MassRoads are distributed over a variety of small and large cities in the state. Mostly urban and suburban areas.
DATA_DIR = Path("roads")
records = []
for tif in DATA_DIR.rglob("*.tiff"):
with rasterio.open(tif) as src:
if not src.crs:
continue
left, bottom, right, top = src.bounds
geom = box(left, bottom, right, top)
records.append({
"path": str(tif),
"geometry": geom
})
gdf = gpd.GeoDataFrame(records, crs="EPSG:26986")
gdf_ll = gdf.to_crs("EPSG:4326")
center = [42.25, -71.8]
m = folium.Map(location=center, zoom_start=9, tiles="cartodb positron")
for _, row in gdf_ll.iterrows():
geojson = folium.GeoJson(row["geometry"])
geojson.add_to(m)
m
The raw MassRoads dataset. Tiffs with a red, green, blue, and road mask bands.
img = rasterio.open("roads/test/10378780_15.tiff").read()
label = rasterio.open("roads/test_labels/10378780_15.tif").read(1)
plt.figure(figsize=(10,5))
plt.subplot(1,2,1)
plt.imshow(img.transpose(1,2,0))
plt.title("Image")
plt.axis("off")
plt.subplot(1,2,2)
plt.imshow(label, cmap="gray")
plt.title("Label")
plt.axis("off")
plt.show()
Data joined with our 0.5m DEM, terrain features derived from the DEM, and a housing mask.
We used these specific terrain features because we thought they'd have the most contrast between roads and non-roads. A roads slope should be low. It's roughness (the difference in slope in the pixels around each pixel) should also be low. The hillshade was primarily for visualization.
We won't use the housing masks much for extracting roads as our whole hypothesis was about doing so to the exclusion of other features. However the building footprints will help us measure how wrong we are. We did pull the footprint data from Microsoft Planetary Computer.
band_names = [
"Red",
"Green",
"Blue",
"Elevation",
"Housing Mask",
"Slope",
"Hillshade",
"Roughness",
"Road Mask"
]
path = Path("tiffs/17428960_15.tiff")
with rasterio.open(path) as src:
img = src.read()
transform = src.transform
width = src.width
height = src.height
cx = width / 2
cy = height / 2
center_x, center_y = transform * (cx, cy)
bands, H, W = img.shape
fig, axes = plt.subplots(3, 3, figsize=(12, 12))
fig.suptitle(f"Bands for {path.name}", fontsize=16)
axes = axes.ravel()
for i in range(9):
ax = axes[i]
if i < bands:
ax.imshow(img[i], cmap="gray")
ax.set_title(band_names[i], fontsize=12)
else:
ax.axis("off")
ax.set_xticks([])
ax.set_yticks([])
plt.tight_layout()
plt.show()
I think we have sufficient data to prove our hypothesis. I can't really think of anything else available statewide at 1m resolution that would really give us much of an advantage finding roads. As always just having the data does not prove our hypothesis so more analysis is needed.
View #2¶
In the previous view we showed that we had the data. But can you see the roads? In the RGB imagery the roads are clear. They are all greyish and of relatively similar width (with the exception of the interstate).
path = Path("tiffs/17428960_15.tiff")
with rasterio.open(path) as src:
img = src.read()
R = img[0]
G = img[1]
B = img[2]
rgb = np.dstack([R, G, B]).astype(np.float32)
rgb -= rgb.min()
rgb /= rgb.max()
plt.figure(figsize=(20, 20))
plt.imshow(rgb)
plt.title(f"True Color RGB View of {path.name}", fontsize=16)
plt.axis("off")
plt.show()