Creating Map Overlays with Ollama

Summarizing the Journey from JSON to Walkable City Insights

(A 4,000‑word deep dive into the article “I was interested in finding walkable areas in a city I had never visited before. After using OpenClaw bot to summarize my JSON files, I thought I could do the same for OSM‑based metrics.”)


1. Introduction

The article follows a data‑driven urban explorer who, armed with an unfamiliar city’s OpenStreetMap (OSM) data, attempts to identify walkable neighborhoods using advanced tooling. The narrative opens with the author’s curiosity about an entirely new urban environment and the hope that technology can illuminate the city’s hidden pedestrian pathways. The journey is broken into three phases:

  1. Preparation – collecting raw OSM data.
  2. Analysis with OpenClaw – summarizing large JSON files.
  3. Translation to Walkability – applying OSM‑based metrics to pinpoint friendly streets.

The article’s purpose is to showcase how a seemingly arcane mix of open‑source tools and statistical metrics can be harnessed by a non‑professional to make sense of a city’s walkability.


2. The Problem Space

2.1 Walkability: Why It Matters

Walkability refers to how friendly a place is to walking, a composite measure that includes street connectivity, sidewalk presence, safety, traffic density, and destination proximity. An increasingly pedestrian‑oriented city can reduce carbon footprints, improve public health, and bolster local economies. Urban planners, residents, and businesses all benefit from clear, data‑driven walkability insights.

2.2 The Data Challenge

Modern walkability analysis relies on high‑resolution spatial data. OpenStreetMap is a community‑driven repository that provides millions of road, path, and building features worldwide. However, a raw OSM extract converted to JSON is voluminous and difficult for a casual user to parse. To extract meaningful metrics (such as sidewalk width, intersection density, or green‑space proximity), one needs:

  • Parsing Tools capable of handling large JSON payloads.
  • Analytical Pipelines that can compute derived metrics from raw geometries.
  • Visualization Systems to translate numbers into intuitive maps or dashboards.

The author’s initial goal was to leverage these components to answer a simple question: Which neighborhoods in this city are easiest and safest for pedestrians?


3. Tooling Overview

3.1 OpenClaw Bot

OpenClaw is an open‑source data‑processing framework designed to turn unstructured JSON into structured tables. Its core strengths are:

  • Parallel Parsing – using multi‑threaded streams to read millions of nodes quickly.
  • Schema Inference – automatically detects key‑value patterns to generate CSV or Parquet outputs.
  • Plug‑in Architecture – allows custom functions to run during parsing (e.g., compute distances on the fly).

The author first employed OpenClaw to ingest the city’s raw OSM data. They processed the file into a tidy CSV containing node IDs, latitudes, longitudes, and associated tags (e.g., highway=primary, foot=designated).
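The flattening step can be illustrated without OpenClaw itself. The sketch below uses only the Python standard library and assumes an Overpass-style JSON layout (an `elements` list of nodes and ways); the article does not show the real file, so the sample data and the `nodes_to_rows` helper are illustrative, not the author's code.

```python
import csv
import io
import json

# Assumed input shape (Overpass-style OSM JSON); sample values are made up.
sample = json.loads("""
{"elements": [
  {"type": "node", "id": 1, "lat": 52.1, "lon": 4.3, "tags": {"highway": "crossing"}},
  {"type": "node", "id": 2, "lat": 52.2, "lon": 4.4},
  {"type": "way", "id": 9, "nodes": [1, 2], "tags": {"highway": "footway"}}
]}
""")


def nodes_to_rows(osm):
    """Yield one flat row per node; tags stay a JSON string so no key is lost."""
    for element in osm["elements"]:
        if element.get("type") != "node":
            continue  # ways and relations are skipped in this sketch
        yield {
            "node_id": element["id"],
            "lat": element["lat"],
            "lon": element["lon"],
            "tags": json.dumps(element.get("tags", {}), sort_keys=True),
        }


rows = list(nodes_to_rows(sample))

# Write the rows to an in-memory CSV with the four columns named in the text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["node_id", "lat", "lon", "tags"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Keeping `tags` as a serialized JSON object mirrors the column layout the article describes and avoids guessing which tag keys deserve their own column.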

3.2 OSM‑Based Metrics Libraries

Several Python libraries provide ready‑made metrics for walkability:

| Library | Core Functionality | Strength | Example Metric |
|---------|--------------------|----------|----------------|
| OSMNX | Network extraction & analysis | User‑friendly, integrates with Geopandas | length_of_highway |
| Pyrosm | Streamlined OSM parsing | Fast, low memory | walk_speed |
| NetworkX | Graph theory | Flexible custom metrics | average_node_degree |
| scikit‑learn | ML clustering | Advanced analytics | k‑means for neighborhood clusters |

The author selected OSMNX for its ease of use and ability to generate a pedestrian network graph. OSMNX’s graph_from_point or graph_from_polygon functions built a graph where nodes represent intersections and edges represent walkable streets or sidewalks.

3.3 Visualisation & Dashboarding

Once the graph was constructed, the author used Folium to render interactive maps and Plotly to create scatterplots of metrics. They also wrote a lightweight web dashboard (Flask + Bootstrap) that allowed them to filter neighborhoods by criteria such as “intersection density > 5” or “average sidewalk width > 2.5 m.”
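The dashboard filters amount to simple threshold checks. A minimal sketch, assuming each neighborhood is a record with an intersection density and an average sidewalk width; the field names, the `filter_neighborhoods` helper, and the sample values are hypothetical and not taken from the article.

```python
# Hypothetical per-neighborhood metrics; names and numbers are placeholders.
neighborhoods = [
    {"name": "A", "intersection_density": 7.2, "avg_sidewalk_width_m": 2.8},
    {"name": "B", "intersection_density": 4.1, "avg_sidewalk_width_m": 3.0},
    {"name": "C", "intersection_density": 6.0, "avg_sidewalk_width_m": 1.9},
]


def filter_neighborhoods(records, min_density=None, min_width_m=None):
    """Return records exceeding every threshold that is set (None disables it)."""
    selected = []
    for record in records:
        if min_density is not None and record["intersection_density"] <= min_density:
            continue
        if min_width_m is not None and record["avg_sidewalk_width_m"] <= min_width_m:
            continue
        selected.append(record)
    return selected


# "intersection density > 5"
dense = filter_neighborhoods(neighborhoods, min_density=5)
# "intersection density > 5" and "average sidewalk width > 2.5 m"
dense_and_wide = filter_neighborhoods(neighborhoods, min_density=5, min_width_m=2.5)
```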


4. Data Preparation

4.1 Downloading OSM Data

The author fetched the city’s OSM extract from Geofabrik, which provides up-to‑date data in PBF format. They noted that the raw file was ~600 MB for a mid‑size city. Using the osmosis command, they converted the PBF to a JSON file:

osmosis --read-pbf file=city.pbf --write-json file=city.json

This step preserved all tags, but the file size ballooned to ~5 GB, illustrating the need for efficient parsing.

4.2 Parsing with OpenClaw

OpenClaw’s command‑line interface was invoked:

openclaw --input city.json --output city_parsed.parquet

OpenClaw streamed the file in 64‑MB chunks, deduced that nodes with highway tags were relevant for walkability, and dropped unrelated tags. The resulting Parquet file was ~120 MB, roughly a 98 % reduction from the ~5 GB JSON.

The parsed table included columns:

  • node_id
  • lat
  • lon
  • tags (JSON object)
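The tag pruning described above might look like the following sketch. The whitelist `WALK_TAG_KEYS` and the `keep_walk_relevant` generator are assumptions made for illustration; the article does not say which keys OpenClaw kept.

```python
# Assumed whitelist of tag keys; the article does not list the keys that were kept.
WALK_TAG_KEYS = {"highway", "foot", "sidewalk", "maxspeed", "lanes"}


def keep_walk_relevant(elements):
    """Yield only elements with a highway tag, stripped of non-whitelisted tags."""
    for element in elements:
        tags = element.get("tags", {})
        if "highway" not in tags:
            continue  # not part of the street network
        pruned = {key: value for key, value in tags.items() if key in WALK_TAG_KEYS}
        yield {**element, "tags": pruned}


# Hypothetical elements used to exercise the filter.
elements = [
    {"id": 1, "tags": {"highway": "footway", "name": "X", "foot": "designated"}},
    {"id": 2, "tags": {"amenity": "cafe"}},
    {"id": 3},
]
filtered = list(keep_walk_relevant(elements))
```

Because the function is a generator, it can be fed one chunk of elements at a time, which fits the streaming approach the article describes.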

4.3 Cleaning & Normalising

Before feeding the data into OSMNX, the author performed a cleaning routine:

  1. Filter for pedestrian relevance – kept nodes where foot=designated or highway is one of footway, path, cycleway.
  2. Deduplicate – merged overlapping nodes within 1 m to reduce noise.
  3. Attribute enrichment – added derived fields such as elevation (from SRTM data) and park_proximity (distance to nearest green space).

The cleaned dataset was saved as city_clean.parquet.
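The first two cleaning steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the author's routine: distances use the haversine formula, deduplication is a simple greedy pass that keeps the first node of each cluster, and the helper names and sample nodes are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000
FOOT_HIGHWAYS = {"footway", "path", "cycleway"}  # values named in step 1


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_pedestrian(tags):
    """Step 1: keep foot=designated or a pedestrian-type highway value."""
    return tags.get("foot") == "designated" or tags.get("highway") in FOOT_HIGHWAYS


def deduplicate(nodes, tolerance_m=1.0):
    """Step 2: greedy O(n^2) pass; drop any node within tolerance of a kept node."""
    kept = []
    for node in nodes:
        if all(
            haversine_m(node["lat"], node["lon"], k["lat"], k["lon"]) > tolerance_m
            for k in kept
        ):
            kept.append(node)
    return kept


# Hypothetical nodes: the first two are about 0.1 m apart.
nodes = [
    {"id": 1, "lat": 52.0, "lon": 4.0, "tags": {"highway": "footway"}},
    {"id": 2, "lat": 52.000001, "lon": 4.0, "tags": {"highway": "path"}},
    {"id": 3, "lat": 52.001, "lon": 4.0, "tags": {"foot": "designated"}},
    {"id": 4, "lat": 52.002, "lon": 4.0, "tags": {"highway": "primary"}},
]
clean = deduplicate([n for n in nodes if is_pedestrian(n["tags"])])
```

For a city-sized dataset the quadratic pass would be too slow; a spatial index or grid bucketing would be the usual replacement.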


5. Building the Pedestrian Network

5.1 Extracting Walkable Paths with OSMNX

Using OSMNX, the author built a graph:

import osmnx as ox
G = ox.graph_from_place("Sample City, Country", network_type="walk")

The network_type="walk" argument tells OSMNX to include all pedestrian‑relevant edges. By default OSMNX downloads the necessary data from the Overpass API over HTTP, but for larger cities the author used the pre‑parsed Parquet to speed up construction.

5.2 Edge & Node Attributes

Each edge in the graph carries attributes:

  • length (in meters)
  • highway (e.g., residential, primary)
  • foot (e.g., designated, yes)
  • sidewalk (e.g., both, right, left)
  • maxspeed (converted from mph/kph to m/s)
  • lanes (if available)

The author extended the graph with additional attributes:

  • Sidewalk width – inferred from sidewalk tags and a default width assumption per side (e.g., both: 2 m each side).
  • Walkability score – computed as a weighted sum of intersection_density, sidewalk_quality, and traffic_volume.
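Two of these derived attributes are easy to sketch. The code below assumes the article's example default of 2 m per sidewalk side and the OSM convention that a bare `maxspeed` number is in km/h while values in miles per hour carry an `mph` suffix; the function names are illustrative.

```python
DEFAULT_WIDTH_PER_SIDE_M = 2.0  # the article's example assumption per side


def sidewalk_width_m(sidewalk_tag):
    """Total inferred sidewalk width for an edge, summed over tagged sides."""
    sides = {"both": 2, "left": 1, "right": 1}.get(sidewalk_tag, 0)
    return sides * DEFAULT_WIDTH_PER_SIDE_M


def maxspeed_to_mps(value):
    """Convert maxspeed strings such as '50', '50 km/h' or '30 mph' to m/s."""
    text = str(value).strip().lower()
    if text.endswith("mph"):
        return float(text[:-3]) * 0.44704  # 1 mph = 0.44704 m/s
    # Bare numbers are treated as km/h.
    return float(text.replace("km/h", "").replace("kph", "")) / 3.6


width_both = sidewalk_width_m("both")   # two sides at the default width
width_none = sidewalk_width_m("no")     # unknown or absent values fall back to 0
speed_kmh = maxspeed_to_mps("50")
speed_mph = maxspeed_to_mps("30 mph")
```

Non-numeric values such as `maxspeed=walk` would raise a `ValueError` here and would need explicit handling in a real pipeline.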

5.3 Computing Intersection Density

The intersection density metric (ID) is defined as the number of nodes per square kilometre. The author used a grid overlay (0.01 km × 0.01 km cells) to count nodes per cell:

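# Illustrative pseudocode: grid_cells and count_nodes stand for user-written
# helpers; OSMNX itself does not provide a grid_cells function.
cell_ids = grid_cells(G, cell_size=0.01)
id_map = {cell: count_nodes(cell) for cell in cell_ids}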

The resulting ID values were stored as a GeoDataFrame and later visualized.
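A runnable sketch of the per-cell count, assuming node coordinates have already been projected to metres. The `intersection_density` function, its parameters, and the sample points are illustrative, and the 500 m cell size is chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter


def intersection_density(points_m, cell_size_m):
    """Count nodes per square grid cell and convert the counts to nodes per km^2.

    points_m: iterable of (x, y) coordinates in metres (a projected CRS is assumed).
    Returns a dict mapping (column, row) cell indices to a density value.
    """
    counts = Counter(
        (int(x // cell_size_m), int(y // cell_size_m)) for x, y in points_m
    )
    cell_area_km2 = (cell_size_m / 1000.0) ** 2
    return {cell: n / cell_area_km2 for cell, n in counts.items()}


# Hypothetical projected node coordinates (metres).
points = [(10, 10), (20, 30), (480, 490), (510, 20)]
density = intersection_density(points, cell_size_m=500)
# Cell (0, 0) holds three nodes in 0.25 km^2; cell (1, 0) holds one.
```

Only occupied cells appear in the result, so empty cells have to be filled with zero before mapping if a complete grid is needed.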


6. Walkability Metrics

6.1 Baseline Metric Definition

The baseline walkability index (BWI) is calculated as:

\[ \text{BWI} = \alpha \times \frac{1}{\text{Average Edge Length}} + \beta \times \frac{\text{Sidewalk Coverage}}{\text{Total Road Length}} + \gamma \times \text{Intersection Density} \]

where:

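  • \(\alpha\), \(\beta\), and \(\gamma\) are weighting coefficients chosen experimentally.
  • Average Edge Length is the mean length of a graph edge; its reciprocal acts as a proxy for path density.
  • Sidewalk Coverage is the total length of edges that have sidewalks, so dividing it by Total Road Length gives the share of the network with sidewalks.
  • Intersection Density is the value computed in Section 5.3.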

The author set \(\alpha = 0.4\), \(\beta = 0.3\), and \(\gamma = 0.3\), which weights the path‑density term slightly above sidewalk coverage and intersection density.
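The three terms have different units, so some rescaling is needed before a weighted sum yields scores between 0 and 1. The sketch below assumes min-max normalization across neighborhoods for the reciprocal edge length and the intersection density, and treats sidewalk coverage as a length ratio; the article does not state its normalization, and the area records are hypothetical.

```python
ALPHA, BETA, GAMMA = 0.4, 0.3, 0.3  # weights reported in the article


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def bwi_scores(areas):
    """Weighted sum of normalized path density, sidewalk share, and intersection density."""
    inverse_length = min_max([1.0 / a["avg_edge_length_m"] for a in areas])
    coverage = [a["sidewalk_length_m"] / a["total_road_length_m"] for a in areas]
    density = min_max([a["intersection_density"] for a in areas])
    return [
        ALPHA * i + BETA * c + GAMMA * d
        for i, c, d in zip(inverse_length, coverage, density)
    ]


# Hypothetical inputs for three areas (not values from the article).
areas = [
    {"avg_edge_length_m": 80, "sidewalk_length_m": 9000, "total_road_length_m": 10000, "intersection_density": 120},
    {"avg_edge_length_m": 100, "sidewalk_length_m": 5000, "total_road_length_m": 10000, "intersection_density": 75},
    {"avg_edge_length_m": 200, "sidewalk_length_m": 2000, "total_road_length_m": 10000, "intersection_density": 30},
]
scores = bwi_scores(areas)
```

With min-max scaling the scores are relative to the set of areas being compared, so adding or removing an area can shift every score.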

6.2 Advanced Metrics

Beyond BWI, the author explored additional indices:

| Metric | Calculation | Purpose |
|--------|-------------|---------|
| Pedestrian Exposure Index (PEI) | \( \text{PEI} = \frac{\text{Walkable Length}}{\text{Total Road Length}} \times \frac{1}{\text{Traffic Volume}} \) | Gauges exposure to traffic. |
| Green Corridor Score (GCS) | \( \text{GCS} = \frac{\text{Adjacent Green Space Length}}{\text{Walkable Length}} \) | Highlights proximity to parks. |
| Accessibility Index (AI) | \( \text{AI} = \frac{\text{Public Transit Stops}}{\text{Total Area}} \) | Measures multimodal connectivity. |

These metrics were plotted in a radar chart for each neighborhood to provide a holistic walkability profile.
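The three formulas translate directly into code. This sketch follows the definitions in the table; the input numbers are hypothetical, and the traffic volume is treated as a unitless relative index, which is an assumption because the article does not give its units.

```python
def pedestrian_exposure_index(walkable_length, total_road_length, traffic_volume):
    """PEI: walkable share of the network, scaled down by traffic volume."""
    return (walkable_length / total_road_length) / traffic_volume


def green_corridor_score(green_adjacent_length, walkable_length):
    """GCS: share of walkable length that runs next to green space."""
    return green_adjacent_length / walkable_length


def accessibility_index(transit_stops, total_area):
    """AI: public transit stops per unit of area."""
    return transit_stops / total_area


# Hypothetical inputs for one neighborhood (lengths in metres, area in km^2).
pei = pedestrian_exposure_index(6000, 10000, traffic_volume=1.5)
gcs = green_corridor_score(1500, 6000)
ai = accessibility_index(12, 3.0)
```

Raw values like these sit on different scales, so they would need rescaling before being drawn on a shared radar chart axis.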


7. Results & Insights

7.1 Spatial Distribution of Walkability

The final walkability map displayed a clear gradient: older, dense downtown cores scored highest, while peripheral suburban zones lagged. The map was colour‑coded (red = low, green = high). The author noted that a few newer developments near the city center had surprisingly high walkability due to dense mixed‑use planning and dedicated pedestrian corridors.

7.2 Top‑Scoring Neighborhoods

| Rank | Neighborhood | BWI | PEI | GCS | AI |
|------|--------------|-----|-----|-----|----|
| 1 | Downtown Historic | 0.87 | 0.78 | 0.65 | 0.92 |
| 2 | Riverfront District | 0.84 | 0.72 | 0.73 | 0.88 |
| 3 | Midtown Greenbelt | 0.81 | 0.70 | 0.68 | 0.85 |
| 4 | Old Town | 0.79 | 0.68 | 0.60 | 0.80 |
| 5 | Northside Plaza | 0.76 | 0.65 | 0.55 | 0.78 |

The author highlighted that the Riverfront District had the highest GCS of the five (0.73), owing to its proximity to a riverbank park and pedestrian bridges.

7.3 Low‑Scoring Areas

Peripheral neighborhoods such as Southridge, Eastwood, and Hillcrest consistently scored below 0.60. Their low scores were attributed to:

  • Sparse sidewalk coverage (mostly sidewalk=no).
  • High average edge lengths (> 200 m).
  • Limited intersection density.
  • Lack of green corridors.

The author suggested targeted interventions like constructing cross‑walks, adding sidewalks, and creating pocket parks.

7.4 Temporal Dynamics

Using historical OSM data snapshots, the author examined how walkability evolved over a 5‑year period. They found that districts that adopted complete streets policies saw a 12 % rise in BWI, whereas zones with new highway expansions saw a 9 % drop. This temporal lens offered evidence for policymakers that street‑wide interventions significantly influence pedestrian experience.
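The reported rises and drops are relative changes between two snapshots. A minimal worked example, using hypothetical BWI values chosen only so that the results land near the 12 % and 9 % figures quoted above:

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical BWI values for two districts, five years apart.
rise = percent_change(0.50, 0.56)    # about +12 %
drop = percent_change(0.50, 0.455)   # about -9 %
```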


8. Interpretation & Implications

8.1 Data‑Driven Urban Planning

The article argues that walkability metrics derived from OSM data can complement traditional surveys and expert assessments. Because OSM is freely available and constantly updated, planners can run “what‑if” analyses on emerging developments or post‑construction changes with minimal cost.

8.2 Community Involvement

Because OSM is community‑generated, residents can see their neighborhoods reflected in the walkability dashboards. This transparency can drive advocacy: if a resident sees that their area’s PEI is low, they can lobby for pedestrian improvements.

8.3 Equity Considerations

The author raises the point that lower walkability often correlates with lower socioeconomic status. The analysis revealed that underserved neighborhoods (e.g., Southridge) were also those with the highest need for affordable, walkable housing. Thus, the metrics can inform equitable transportation policy.


9. Limitations & Caveats

| Limitation | Effect | Mitigation |
|------------|--------|------------|
| Incomplete OSM tags | Underestimation of sidewalks or bike lanes. | Cross‑check with municipal GIS where available. |
| Assumed sidewalk width | Over/underestimation of walkability. | Use high‑resolution imagery or local surveys for calibration. |
| Traffic volume data | Inferred rather than measured. | Incorporate real traffic counters or mobile phone data for validation. |
| Temporal lag | OSM updates may lag behind physical changes. | Integrate citizen‑reporting features or sensor networks. |
| Weighting coefficients | Subjective choice of \(\alpha, \beta, \gamma\). | Perform sensitivity analysis to gauge robustness. |

The author stresses that any metric must be contextualized: walkability is a multi‑dimensional concept, and numbers alone cannot replace lived experience.
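One simple form of the sensitivity analysis mentioned in the table is to sweep the weights over a grid and check how often the neighborhood ranking stays the same. The sketch below assumes already-normalized components and uses hypothetical values; `weight_grid` and `ranking` are illustrative helpers.

```python
from itertools import product

# Hypothetical normalized components per area:
# (path density, sidewalk coverage, intersection density).
components = {
    "A": (0.9, 0.8, 0.7),
    "B": (0.6, 0.9, 0.5),
    "C": (0.3, 0.2, 0.4),
}


def ranking(weights):
    """Area names ordered from highest to lowest weighted score."""
    scores = {
        name: sum(w * c for w, c in zip(weights, values))
        for name, values in components.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def weight_grid(steps=10):
    """Yield every (alpha, beta, gamma) on a regular grid that sums to 1."""
    for i, j in product(range(steps + 1), repeat=2):
        if i + j <= steps:
            yield (i / steps, j / steps, (steps - i - j) / steps)


baseline = ranking((0.4, 0.3, 0.3))
candidates = list(weight_grid())
# Share of weight combinations that reproduce the baseline ranking.
stability = sum(ranking(w) == baseline for w in candidates) / len(candidates)
```

A stability close to 1 would suggest the ranking is robust to the choice of weights; a low value would mean the conclusions depend heavily on them.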


10. Future Directions

10.1 Machine Learning for Attribute Prediction

The author proposes using supervised learning to predict missing attributes (e.g., sidewalk width, traffic volume) from surrounding features (road class, land use, elevation). A Random Forest model trained on well‑annotated zones could generalize to sparsely tagged areas.

10.2 Real‑Time Dashboards

Deploying a live data pipeline (e.g., Kafka + Spark) could allow the walkability dashboard to update as OSM data changes or as real‑time sensor data arrives. This would be invaluable for city operators monitoring pedestrian safety during events.

10.3 Multi‑Modal Integration

Adding cycling and transit metrics would produce a Transit‑Pedestrian‑Cyclist Index (TPC), enabling holistic mobility planning. OSM also contains cycleway tags and transit stops, which can be fused into the network.

10.4 Policy Simulation Tools

With the graph ready, the author plans to simulate “complete streets” interventions: converting a two‑way arterial to a one‑way pedestrian promenade, adding mid‑block crosswalks, or creating shared‑use paths. By re‑computing BWI before and after, planners can evaluate policy trade‑offs quantitatively.
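A before-and-after comparison can be sketched on a plain edge list. The example below recomputes two of the BWI inputs (average edge length and sidewalk share) after adding a sidewalk and a mid-block crossing; the edge records and helper functions are hypothetical, and a real simulation would modify the full graph instead.

```python
def summarize(edges):
    """Average edge length and length-weighted sidewalk share of an edge list."""
    total = sum(e["length_m"] for e in edges)
    with_sidewalk = sum(e["length_m"] for e in edges if e["sidewalk"] != "no")
    return {
        "avg_edge_length_m": total / len(edges),
        "sidewalk_coverage": with_sidewalk / total,
    }


def add_sidewalk(edges, index, sides="both"):
    """Return a copy of the edge list with a sidewalk added to one edge."""
    return [{**e, "sidewalk": sides} if i == index else e for i, e in enumerate(edges)]


def add_midblock_crossing(edges, index):
    """Return a copy where one edge is split into two halves by a new crossing node."""
    half = {**edges[index], "length_m": edges[index]["length_m"] / 2}
    return edges[:index] + [half, dict(half)] + edges[index + 1:]


# Hypothetical street segment data.
before = [
    {"length_m": 300.0, "sidewalk": "no"},
    {"length_m": 100.0, "sidewalk": "both"},
]
after = add_midblock_crossing(add_sidewalk(before, 0), 0)

metrics_before = summarize(before)
metrics_after = summarize(after)
```

Because each helper returns a new list, the original network stays untouched and several interventions can be compared against the same baseline.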


11. Conclusion

The article chronicles a compelling case study: how a non‑expert can harness open‑source tools to transform raw OSM data into actionable walkability insights. The key takeaways are:

  • OpenClaw provides a fast, memory‑efficient pipeline for parsing huge JSON files into structured data.
  • OSMNX (paired with NetworkX) builds a detailed pedestrian network, enabling rich metric calculations.
  • Custom Metrics (BWI, PEI, GCS, AI) translate raw geometry into understandable scores that can inform policy.
  • Visual Dashboards turn numbers into accessible maps, empowering residents and planners alike.
  • Limitations around data completeness, assumptions, and subjective weighting must be acknowledged and mitigated.

By bridging the gap between massive OSM datasets and the everyday questions of urban walkability, the author demonstrates that data science, when applied thoughtfully, can illuminate pathways to healthier, more equitable cities—regardless of whether you’ve walked those streets before.
