Archived 2025-01-22. Note that this code uses older versons of libraries that now have known security vulnerabilities, and I'm not planning to update the dependencies.
This presentation chronicles my attempts to use Rust to understand how the Green Line Extension will affect commutes in Somerville.
This map shows relative transit times to downtown.
The Green Line Extension will add seven new stations to the north end of the MBTA green line. It will cost about $3 billion and is slated for completion about two years from now.
Map Courtesy Wikipedia user Pi.1415926535, CC-BY-SA 3.0
I want to look at how much the Green Line will change transit times from Somerville to downtown Boston.
I'd like to analyze transit times... so I clearly need to write an OpenStreetMap file ingester and a GPU-based map renderer!?
OpenStreetMap is like Wikipedia for geographical data. Besides literally streets, it contains over 5 billion buildings, parks, train tracks, trees, benches, and much more. This data set powers many maps that you see around the Internet.
Geofabrik GmbH provides downloads of various regions, down to U.S. states. I chose to download the Massachusetts data in .pbf
format, The data is is encoded with Google's protocol buffers, a binary encoding format.
I make use of two OSM concepts:
Data:
amenity
: vending_machine
, vending
: public_transport_tickets
Data:
highway
: tertiary
, name
: Pearl Street
This is a protocol buffer format for efficiently storing OSM data.
It uses a few neat tricks:
The file is organized in triplets:
- #1 block header size in bytes
- #1 block header
- #1 block
- #2 block header size in bytes
- #2 block header
- #2 block
...
message BlobHeader {
required string type = 1; <- There are multiple types of blocks
...
required int32 datasize = 3; <- The size of the upcoming block
}
A new trick for me! Each array is the same length:
message DenseNodes {
repeated sint64 id = 1 [packed = true];
optional DenseInfo denseinfo = 5;
repeated sint64 lat = 8 [packed = true];
repeated sint64 lon = 9 [packed = true];
// Alternates key, value, key, value, ...
repeated int32 keys_vals = 10 [packed = true];
}
Since proto3, scalar numeric fields are packed by default.
Original data: 100
, 101
, 103
, 104
Delta-encoded: 100
, 1
, 2
, 1
Why is this good? With protobuf varints, the space required to encode an integer is proportional to the log of its size. Since OSM contains over 5 billion nodes, we're probably encoding most ints in 1 byte instead of 5 bytes, saving 80% space!
I made my own ingester for OSM PBF data! Not sure if that was a good idea.
Challenges:
Yeah, it probably wasn't a good idea. Although I had trouble using the osmpbf-reader
crate, I'd like to try the osmpbf
crate.
Because the file is split into blocks, I was able to decode each block individually, resulting in a huge speedup.
My model of a typical commute is:
In this model, each resident chooses the station that will result in the lowest total transit time.
Possible extensions:
"As the crow flies" with a constant penalty to account for the fact that people don't fly.
I expect this to introduce some error, especially near difficult-to-pass objects like I-93 and McGrath Highway.
Possible extension: pathfinding with actual sidewalk data
I have defined "downtown" as, "Park Street or Downtown Crossing, whichever is closer."
Existing lines: use schedules, test calibration by hand
GLX: measure distance between stations, comparing to existing D branch of the Green Line. I'm assuming that the trains will move at the same average speed, 19 MPH, because the density of stations is not too far off (thanks JBR!).
After doing this, I found a copy of the 2009 Draft Environmental Impact Report which thankfully confirmed the sanity of these estimates.
Measured in minutes to Lechmere.
I also estimated the typical wait time between trains at each station.
On the low end, Lechmere will have an expected wait only of 2 minutes since it's on both Green Line branches. On the high end, the orange line is more like 9 minutes.
Possible extension: incorporate subway on-time performance
Currently: simply try each station for the given location and use the best one. This isn't always the closest station: for example, if you're halfway between Porter and Magoun, traveling through Porter will be a bit faster.
Possible extension: use heuristics to consider fewer stations
Goals: produce useful and informative maps of the Somerville area.
Two tricky problems here:
The map should:
Rust's 2D graphics landscape is evolving rapidly!
A Guide to Rust Graphics Libraries, As of May 2019 (author of ggez, so this is from the perspective of games)
What graphics API? wgpu-rs
because I am a library hypebeast.
Back ends: Vulkan, Metal, and DirectX
The successor to the deprecated gfx
crate
Strongly inspired by Vulkan, but with a compatibility layer to work with Metal and DirectX. Higher-level than gfx-hal
Selecting a good color palette is important for clarity and accessibility.
I'm using the palette
crate for color space math.
Starting with CIE L*a*b* color space, designed to be perceptually uniform.
To specify colors by hand, using CIE L*C*h°, a cylindrical version of that space.
Use color palettes friendly to color blind people, i.e. don't contrast red vs. green.
I was highly inspired by Color Schemes from Programming Design Systems, a free digital book by Rune Madsen.
I used lyon
to tesselate my graphics (i.e. to turn everything into triangles)
I wrote simple GLSL shaders and compiled them to Vulkan with shaderc
.
Colors:
Let's use an uncertainty of 5 minutes: think about headway, train delays, and modeling inaccuracies
There is an underserved corridor in central and northern parts of Somerville.
Now that corridor is better-served. 👍
How much better-served? In this map, the data ranges from 0 to 20 minutes. The highest deltas can be seen in Union Square, Gilman Square, and southeast of Magoun Square.
This map shows the best MBTA line for each location. This shows that a large chunk of Somerville will be "on the green line."
I used OpenStreetMap and wgpu-rs to analyze MBTA's upcoming Green Line Extension with Rust.
protobuf
and rayon
csv
geo
palette
lyon
wgpu