I'm #hiring for #DataOperations to build and maintain transportation data pipelines and the infrastructure they depend on. We use #Airflow & #Python in #AWS and #RHEL to Extract, Validate, Load & Transform our data in a #PostgreSQL database.
Deadline April 27th.
jobs.toronto.ca/job-invite/6...
Posts by Rafael H. M. Pereira 🚡 Urban Demographics
Streamgraph titled "Name Waves: the Ebb and Flow of American Baby Names," displaying the popularity of the top 10 baby names per sex in the USA from 1950 to 2022, sourced from the US Social Security Administration. Stream width reflects total births. Color encodes peak era: warm tones (yellow, orange, red) for early-peak names and cool tones (pink, purple, blue, teal, green) for recent-peak names. The chart is split into two sections stacked vertically. The upper section, labeled "Girls" in bold teal, shows a wide stream that peaks broadly around the 1980s–1990s before narrowing toward 2022. Names labeled within the streams include Linda, Mary, Susan, Lisa (warm tones, prominent in the 1950s–1960s), Patricia and Jennifer (mid-era, orange to pink), and Sarah, Ashley, Jessica, Elizabeth (cooler tones, prominent from the 1980s onward). The lower section, labeled "Boys," shows a similarly shaped stream widest in the 1950s–1960s and tapering toward 2022. Names labeled include James, John, Robert, William, David, Michael (warm yellow-green tones, dominant in the 1950s–1970s), and Joseph, Matthew, Christopher, Daniel (cooler teal and purple tones, peaking in the 1980s–1990s). Credit text reads: "Data: US Social Security Administration via {babynames} · #30DayChartChallenge 2026 · Day 12 · FlowingData · Ilya Kashnitsky @ikashnitsky.phd."
DAY 12 -- Flowing Data 🌊 #30DayChartChallenge
Explorations of the US names are always fun. Here we look at the most popular names by sex and distinguish them by timing of their peak popularity 🗻
🔗 #rstats code: github.com/ikashnitsky/...
🧙‍♂️ pplx chat: www.perplexity.ai/search/day-1...
Facility and route optimization on road network graphs can solve countless problems across many industries.
However, these solvers often require expensive software to set up and run, or expensive, rate-limited travel-time matrix APIs.
The spopt-r R package offers a solution.
The Huff model is the classic algorithm in retail spatial analysis - and you can now use it in R.
Predict:
- Which store a customer is likely to visit
- Sales potential per location
- How new stores reshape the competitive landscape
Learn more: walker-data.com/spop...
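For readers new to the Huff model, here is a minimal Python sketch of its textbook form, where the probability of visiting store j is proportional to attractiveness^alpha / distance^beta. This is not the spopt-r API. The function name, the exponents `alpha` and `beta`, and the store figures are all hypothetical, chosen only for illustration.

```python
def huff_probabilities(attractiveness, distances, alpha=1.0, beta=2.0):
    """Textbook Huff model for one customer location.

    attractiveness: one value per store (e.g. floor area); hypothetical inputs.
    distances: distance or travel time from the customer to each store (> 0).
    alpha, beta: illustrative exponents, normally calibrated from data.
    """
    # Utility of each store: attractive stores pull, distant stores repel.
    utilities = [a ** alpha / d ** beta for a, d in zip(attractiveness, distances)]
    total = sum(utilities)
    # Normalize so the probabilities across stores sum to 1.
    return [u / total for u in utilities]


# Hypothetical example: three stores with floor areas in m^2 and distances in km.
probs = huff_probabilities([1000.0, 2000.0, 1500.0], [1.0, 2.0, 3.0])
```

Multiplying each probability by a customer's spending and summing over customer locations gives a rough sales potential per store. Adding a new store to the inputs shows how it takes share from the existing ones.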
I've been waiting to read something like this for a long time. Thanks, @transportist.org !
Diagonal panning video from a very detailed shaded map of Szczecin, Poland. The full map is 20000 x 20000 pixels with 0.5 m resolution and is available at https://shadedmaps.github.io Data source: https://www.geoportal.gov.pl [Head Office of Geodesy and Cartogr...
Our results can't be extrapolated; we say so in the paper. But other studies around the world looking at the working-age population reach the same conclusion.
New in the R mapgl package: static maps!
I developed mapgl to bring my favorite interactive mapping libraries to R. But - interactive maps are by definition hard to share in non-interactive formats.
I'm rolling out a couple new functions I'm already finding useful:
Wonderful! Thanks! We're using Rust in this package, but I'm not sure we're following good practices: github.com/ipeaGIT/ende... Any feedback here would be super welcome!
Any chance the slides will be shared online?
Come to @cascadiarevolting.bsky.social and take my Intro to Rust + Extendr workshop ORRRR contribute to base R your call!
I'll only be offended if you don't come! /s
#rstats
Paris has just elected another bike-friendly mayor!
After Anne Hidalgo transformed the city, her PS-colleague Emmanuel Grégoire takes over, beating former right-wing minister Rachida Dati by a large margin.
Which brings me to my research! What we show in this paper is that AI’s ability to produce expert-looking content at zero cost *raises* the demand for experts who can help you tell apart real from fake. 7/
filipecampante.org/wp-content/u...
This is why I'm so excited about my new R/Python package, {freestiler}.
I connect to a 146M row @duckdb.org database; generate vector tiles for 2.8 million jobs in CO from a query; serve the tiles and visualize on a @maplibre.org map.
All in seconds!
Get started: walker-data.com/freestiler
Recommended by several people, this NYT Magazine piece on the new era of AI agents is very good.
It (and some conversations among ddjs) helped me think about the role of AI agents in academic research in the digital humanities. (personal reflections below)
www.nytimes.com/2026/03/12/m...
- Measuring exposure to extreme heat in public transit
Paper: www.sciencedirect.com/science/arti...
🔓PDF: www.urbandemographics.org/publication/...
- Estimating public transport emissions from GTFS data
Paper: www.sciencedirect.com/science/arti...
🔓PDF: www.urbandemographics.org/publication/...
My presentation as well as the entire conference is available on Youtube. Here's the link to the recording
www.youtube.com/watch?v=_v19... + and papers👇
I had the opportunity to present two recent studies: one paper looking at transit users' exposure to extreme heat, and another where we proposed an open-source computational model to measure transit emissions from GTFS data +
A couple of weeks ago, @harvardsalata.bsky.social convened a conference on Urban Mobility and Climate Change, bringing together leading experts addressing a wide range of challenges at the intersection of transportation and climate change +
We revised our Skyscraper Revolution paper github.com/Ahlfeldt/DPs...
Added indirect inference to estimate the QoL effect of density by matching causal reduced-form estimates in the model => more realistic counterfactuals. Quick read @voxeu.org: cepr.org/voxeu/column... @bsoeberlin.bsky.social
New in the spopt-r #rstats package: route optimization.
Solve the classic Traveling Salesman Problem for a single driver or optimize a fleet by solving the Vehicle Routing Problem.
Written in Rust so they solve fast.
Read the vignette which covers r5r integration: walker-data.com/spop...
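To make the Traveling Salesman Problem concrete, here is a small Python sketch of the nearest-neighbor heuristic, a simple textbook baseline. It is not the spopt-r solver and not its API. It uses straight-line distances instead of road-network travel times, and the function names and stop coordinates are hypothetical.

```python
import math


def nearest_neighbor_tour(points, start=0):
    """Greedy TSP heuristic: always drive to the closest unvisited stop.

    points: list of (x, y) coordinates; hypothetical straight-line geometry.
    Returns the visiting order as a list of indices into `points`.
    """
    unvisited = [i for i in range(len(points)) if i != start]
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        # Pick the closest remaining stop (ties go to the lowest index).
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def tour_length(points, tour):
    """Total length of the closed tour, including the return to the start."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Hypothetical example: four stops on the corners of a unit square.
stops = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
tour = nearest_neighbor_tour(stops)
length = tour_length(stops, tour)
```

The heuristic is fast but does not guarantee the shortest tour. A dedicated solver working from a travel-time matrix will generally do better on real road networks.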
Yes, I'm starting to use duckspatial in some projects, but we're planning a major update in the next few weeks, so I'd say it's not ready for production use yet. I'm also quite excited about sedona.db, but it seems R support is not really a priority.
I had (kinda jokingly?) wondered if there was a Skill that helps write Skills... and there is!
From the Anthropic Skills marketplace: github.com/anthropics/s...
#rstats users will be pleased to know that anything sf can read can now be piped directly into SedonaDB via GDAL's @arrow.apache.org integration. This makes the SedonaDB R package considerably more useful!
A dot map of Minneapolis, MN's population by race, created using data from the 2020 US Census.
🔵 = White, 🟢 = Black, 🟠 = Hispanic, 🔴 = Asian, 🟤 = Native American/Other, 🟣 = Multiracial
Explore the map: www.censusdots.com/race/minneapolis-mn-demo...
Announcing {freestiler}: a high-performance vector tiling engine for R and Python.
Generate vector tiles for your maps directly from R/Python spatial data, @duckdb queries, and local spatial files.
Check out the docs here: walker-data.com/free...
Some highlights:
The Trays, the most beautiful open office space I've seen. At the Harvard Graduate School of Design GSD
We're kicking off March with two data-driven talks, from 🏙️ cities to ⚾ baseball.
𝗥𝗮𝗳𝗮𝗲𝗹 𝗛. 𝗠. 𝗣𝗲𝗿𝗲𝗶𝗿𝗮 explores spatial accessibility and equitable urban policy.
𝗦𝗰𝗼𝘁𝘁 𝗣𝗼𝘄𝗲𝗿𝘀 applies modern stats to baseball, from pitch models to MLB pickoff strategy and bat-tracking data.
tinyurl.com/mpr87m6t
As of 2025, this analysis of privacy policies indicates that every major AI company uses your private conversations to train their models by default. Every prompt, file, photo, personal detail: all of it feeds directly into model training. arxiv.org/pdf/2509.05382