For more, check out our article at IJGIS: doi.org/10.1080/1365...
Or the open-access preprint here: osf.io/qepc6_v1
Posts by Geoff Boeing
Our goal is not to replace state-of-the-art congested travel models, but to equip less-resourced planners, scholars, and community advocates with a free, open, and accurate tool for accessibility analysis, scenario planning, and evidence-based interventions.
Whereas a naïve model under-predicts travel time by >3 minutes on average, our model mis-predicts by <1 second on average and achieves an out-of-sample MAPE of ~8%, similar to far more data-intensive approaches.
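For readers less familiar with the error metric: MAPE (mean absolute percentage error) takes only a few lines to compute. A minimal sketch, with made-up travel times purely for illustration (not data from the paper):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# hypothetical observed vs. modeled travel times in minutes (illustrative only)
observed = [10.0, 12.5, 8.0, 20.0]
modeled = [9.5, 13.0, 8.5, 19.0]
print(round(mape(observed, modeled), 2))  # → 5.06
```

An out-of-sample MAPE of ~8% means predictions are off by about 8% of the true travel time, on average.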
Using LA as a case study, we combine open data on street networks, speed limits, traffic controls, and turns with a small training sample of empirical travel times from the Google Routes API.
We argue that planners and applied researchers need a cheaper, easier middle ground to predict minimally congested yet accurate travel times:
- a method that uses free, open data
- runs on ordinary hardware
- and substantially improves accuracy over traditional naïve approaches
At the other extreme, state-of-the-art models in computer science and transportation engineering can achieve excellent accuracy, but often require billions of observations, deep learning models, and massive computational resources.
Planners often still rely on "naïve" methods (e.g., minimizing Euclidean distance, network distance, or speed-limit-based traversal time) that systematically under-predict real driving times. This is a problem if it makes driving seem unrealistically fast relative to transit, biking, or walking.
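Here's what the speed-limit-based naïve approach looks like in practice: a minimal sketch with a toy network and made-up edge lengths and speed limits (not the paper's model or data):

```python
import networkx as nx

# toy street network: (from, to, length in meters, speed limit in km/h) — made-up values
G = nx.DiGraph()
edges = [("A", "B", 500, 50), ("B", "C", 400, 50), ("A", "C", 1200, 80)]
for u, v, length_m, speed_kph in edges:
    # naive traversal time: length / speed limit, in seconds
    G.add_edge(u, v, travel_time=length_m / (speed_kph * 1000 / 3600))

path = nx.shortest_path(G, "A", "C", weight="travel_time")
t = nx.shortest_path_length(G, "A", "C", weight="travel_time")
print(path, round(t, 1))  # → ['A', 'C'] 54.0
```

The systematic under-prediction comes from what this ignores: traffic signals, stop signs, turn delays, and congestion all add time that a speed-limit traversal never sees.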
My article "Travel Time Prediction from Sparse Open Data" has just been published in the International Journal of Geographical Information Science. We tackle a longstanding problem of how to predict realistic driving travel times without access to expensive proprietary data.
buff.ly/vvSgm6s
For more, check out the article: doi.org/10.1103/1vj4...
Or the open-access preprint: arxiv.org/abs/2509.21931
Our proposed model starts with a minimum spanning tree (the initial backbone) then adds edges iteratively (the subsequent urbanization) to match empirical degree distributions. We find that this successfully reproduces key empirical characteristics of real-world street networks.
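A minimal sketch of that general idea: start from a minimum spanning tree over random points (the backbone), then iteratively add more edges (the urbanization). Note the selection rule below (just adding the shortest remaining edges) is a simplification for illustration, not the paper's degree-distribution-matching criterion:

```python
import math
import random

import networkx as nx

random.seed(0)
n = 30
pts = {i: (random.random(), random.random()) for i in range(n)}

def dist(u, v):
    return math.dist(pts[u], pts[v])

# complete graph weighted by Euclidean distance
K = nx.complete_graph(n)
for u, v in K.edges:
    K[u][v]["weight"] = dist(u, v)

# backbone: minimum spanning tree (n - 1 edges)
G = nx.minimum_spanning_tree(K)

# "urbanization": iteratively add the shortest edges not yet present
candidates = sorted((dist(u, v), u, v) for u, v in K.edges if not G.has_edge(u, v))
for _, u, v in candidates[:15]:
    G.add_edge(u, v, weight=dist(u, v))

print(G.number_of_nodes(), G.number_of_edges())  # 30 nodes, (n - 1) + 15 = 44 edges
```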
It turns out that most generative models don't capture this fundamental characteristic well. So we propose a new generative model of urban street networks: one that generates networks reproducing this distinguishing feature.
In other words, most network nodes have very few shortest paths that depend on them, but there are some nodes on which a very high number of shortest paths depend. Theoretically, we explain this as a street network that started as a backbone road, then grew/filled in as the area around it urbanized.
A distinguishing feature of urban street networks is their extreme betweenness centrality heterogeneity. That means that street networks are particularly prone to chokepoints (nodes that many trips get funneled through).
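To see what such heterogeneity looks like, here's a toy comparison (illustrative only, with networkx's built-in graph generators rather than real street networks): a uniform grid spreads shortest paths fairly evenly, while a network with a narrow backbone funnels most paths through a few nodes.

```python
import networkx as nx

def heterogeneity(G):
    """Ratio of max to mean betweenness centrality (higher = more chokepoint-prone)."""
    bc = nx.betweenness_centrality(G)
    vals = list(bc.values())
    return max(vals) / (sum(vals) / len(vals))

grid = nx.grid_2d_graph(6, 6)       # uniform lattice: paths spread out
backbone = nx.barbell_graph(10, 3)  # two clusters joined by a 3-node path: all
                                    # inter-cluster paths funnel through the bridge
print(round(heterogeneity(grid), 1), "<", round(heterogeneity(backbone), 1))
```

The bridge nodes in the second network are the chokepoints: nearly every shortest path between the two clusters depends on them.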
Marc Barthelemy and I recently published an article in Physical Review Letters titled "Universal Model of Urban Street Networks."
We present an algorithm to generate urban street networks that reproduce key empirical characteristics.
Paper link: doi.org/10.1103/1vj4...
Here's a short summary...
Properly embedding a spatial network on a curving surface -- capturing the unique underlying topography -- gives us better models of systems, like cities, that are built on such irregular surfaces.
Free open-access link to the paper: doi.org/10.1093/pnas...
We solve this by instead modeling urban networks embedded on the surface of a curving 2D manifold (that is, the topography of the underlying land), then we examine lazy paths and graph arduousness to show how elevation affects betweenness centrality, path ruggedness, and overall network efficiency.
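A toy sketch of the general idea: give each edge both a flat (planar) length and an elevation-aware cost, then compare the shortest paths each produces. The cost function below (length inflated by a steepness penalty) and all node elevations are made up for illustration; they are not the paper's actual formulation.

```python
import networkx as nx

# toy network: made-up node elevations in meters
elev = {"A": 0.0, "B": 60.0, "C": 5.0, "D": 0.0}
edges = [("A", "B", 400), ("B", "D", 400), ("A", "C", 500), ("C", "D", 500)]

G = nx.Graph()
for u, v, length in edges:
    grade = abs(elev[v] - elev[u]) / length
    # "lazy" cost: length inflated by a penalty for steepness (illustrative formula)
    G.add_edge(u, v, flat=length, lazy=length * (1 + 10 * grade))

flat_path = nx.shortest_path(G, "A", "D", weight="flat")  # shorter but over the hill
lazy_path = nx.shortest_path(G, "A", "D", weight="lazy")  # longer but nearly level
print(flat_path, lazy_path)  # → ['A', 'B', 'D'] ['A', 'C', 'D']
```

The planar model picks the route over the hill; the elevation-aware model detours around it, which also shifts which nodes carry high betweenness.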
Most spatial network models are embedded in a 2D plane, but this useful fiction discards important information: variations in a city's 3D elevation affect network connectivity, centrality, and difficulty of travel.
Steep hills make it hard to walk or build streets, and we need to account for that.
I recently published a paper in PNAS Nexus with Marc Barthelemy, Alain Chiaradia, and Chris Webster that addresses the problem of urban networks' elevations: they seem simple enough, but modeling them can be tricky.
Open-access paper link: buff.ly/Mt7u2QG
Here's a short summary...
Counting is hard, but we can make it a little easier by using better models. For more, check out the open-access article: doi.org/10.1111/tgis...
The information compression these algorithms achieve drastically improves the memory and runtime efficiency of downstream graph analytics, boosting analytical tractability without loss of model fidelity.
This article presents OSMnx’s algorithms to automatically simplify spatial graphs of urban street networks—via edge simplification and node consolidation—resulting in faster, more parsimonious models and more accurate network measures such as intersection counts and densities, street lengths, and node degrees.
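Here's a toy illustration of the node-consolidation idea. To be clear, this greedy clustering is NOT OSMnx's implementation (OSMnx handles this far more carefully, via its intersection-consolidation functionality); it just shows why merging nearby nodes fixes the overcount when, say, a divided-road crossing is digitized as several separate nodes:

```python
import math

import networkx as nx

def consolidate(G, pos, tol):
    """Merge nodes closer than tol into single nodes (toy greedy clustering)."""
    H = G.copy()
    merged = True
    while merged:
        merged = False
        for u in list(H.nodes):
            for v in list(H.nodes):
                if u != v and u in H and v in H and math.dist(pos[u], pos[v]) < tol:
                    H = nx.contracted_nodes(H, u, v, self_loops=False)
                    merged = True
    return H

# one real intersection of a divided road, digitized as 4 nearby nodes (0-3),
# plus one ordinary node (4) far away
pos = {0: (0, 0), 1: (0, 5), 2: (5, 0), 3: (5, 5), 4: (100, 0)}
G = nx.Graph([(0, 1), (0, 2), (1, 3), (2, 3), (0, 4)])
H = consolidate(G, pos, tol=10)
print(G.number_of_nodes(), "->", H.number_of_nodes())  # → 5 -> 2
```

A naive count would report 5 "intersections" here; after consolidation we get the 2 the map actually contains.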
Mitigating these 3 problems is a project I’ve been iteratively refining for the past decade. It was a central focus of my dissertation and a key motivation for originally developing OSMnx.
My assessment shows that, if unaddressed, typical intersection counts (and downstream densities) would be overestimated by >14%, and very unevenly so across different parts of the world. This bias’s extreme heterogeneity particularly hinders comparative urban analytics.
This causes spatial uncertainty due to data challenges in representing network nonplanarity, intersection complexity, and curve digitization. Essentially all data sources suffer from at least one of these problems in representing divided roads, slip lanes, roundabouts, interchanges, turning lanes, etc.
Street intersections, particularly the complex kind common in modern car-centric urban areas, are fuzzy objects for which most data sources do not provide a simple 1:1 representation.
Street intersection counts and densities are ubiquitous measures in transport geography and planning. However, typical street network data and typical street network analysis tools can substantially overcount them. This article explains the 3 main reasons why this happens and presents solutions.
But counting is *hard* because defining that set and identifying its members are often nontrivial tasks. Many of the world’s most important analytics rely far less on flashy data science techniques than they do on counting things well and justifying those counts effectively.
Most real-world objects belong to fuzzy categories, resulting in subjective decisions about what to include or exclude from counts. Yet this complexity is often obscured by a superficial impression that counting is easy to do because its mechanics seem easy to understand.
How many street intersections do you see in this figure? I published an article recently in Transactions in GIS (open-access: buff.ly/TZoFrrf) and its first sentence sums it up: "Counting is hard." Hear me out... It really is!