CapRemap: Transit Equity Analysis

Cesar Yahia


This notebook analyzes CapRemap's impact on minority populations by comparing the service before and after transit realignment.

CapRemap was part of CapMetro's Connections 2025 transit plan. Prior to implementation, CapMetro's conducted a service equity analysis (SAFE), and it showed that the service change is compliant with FTA's Title VI requirement. In spite of that, activists are adamant that the service change was inequitable. For more details on CapRemap refer to this blog post

The following analysis deviates from CapMetro's study by looking at stop-level changes in service. At each stop, the change in frequency is precisely evaluated by measuring the change in the number of doors opening before and after CapRemap. To determine the characteristics of affected demographics and the coverage area, the proposed approach uses a buffer with a 1/4 mile radius around each stop. In addition, different from CapMetro's method, this analysis includes service changes to all routes-- not just ones that pass CapMetro's service change and disparate impact thresholds.

In [1]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import geopandas as gpd
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool, ColorBar, WheelZoomTool, Span, Label
from bokeh.plotting import figure
from bokeh.transform import factor_cmap, linear_cmap
from bokeh.tile_providers import get_provider, Vendors
from bokeh.io import output_notebook, show, save
from bokeh.layouts import row, gridplot
from bokeh.util.hex import hexbin
import seaborn as sns
import utils as u  # importlib.reload(u)
import time
import importlib
%matplotlib inline
output_notebook()
Loading BokehJS ...

Demographic Data

The Demographics division of the City of Austin's Planning and Zoning department compiled demographic data from the US Census Bureau and the American Community Survey, and then uploaded this data in the form of shapefiles on ArcGIS Online. This data represents 2012-2016 5-year estimates of income, population, and race distribution in Austin, TX.

This data will be used in the following analysis to measure the % minorities within stop buffers. According to FTA circular guidelines, minorities are defined as the total population minus non-Hispanic Whites.

In [2]:
census = gpd.read_file('../shapefiles/ACS-PopRace/Total Population.shp')  # import ACS-Census data
# the crs is {'init': 'epsg:3857'} which is (pseudo) web-mercator with WGS84 datum, this projected coordinate system is suitable for plotting but not calculations
census.head()
Out[2]:
GEOdisplay TOTAL_POP WHITE BLACK LATINOHISP ASIAN OTHERRACE MULTIRACIA NATIVEAMER HAWAIIAN_P SymbolID SHAPE_Leng SHAPE_Area geometry
0 Census Tract 109.01, Hays County, Texas 7960.0 6085.0 253.0 1377.0 64.0 14.0 103.0 0.0 64.0 None 48409.230018 1.130095e+08 POLYGON ((-10895219.922 3521915.754, -10890904...
1 Census Tract 108.09, Hays County, Texas 5393.0 4349.0 0.0 967.0 50.0 4.0 23.0 0.0 0.0 None 41152.848304 7.651386e+07 POLYGON ((-10908679.015 3529037.211, -10908884...
2 Census Tract 109.09, Hays County, Texas 6502.0 4059.0 177.0 1896.0 120.0 11.0 239.0 0.0 0.0 None 10449.217082 6.744037e+06 POLYGON ((-10894708.046 3506062.749, -10895016...
3 Census Tract 109.06, Hays County, Texas 13567.0 6212.0 1009.0 5965.0 91.0 0.0 287.0 3.0 0.0 None 40118.295024 8.133303e+07 POLYGON ((-10890555.700 3504678.714, -10890126...
4 Census Tract 103.03, Hays County, Texas 8042.0 4257.0 786.0 2661.0 19.0 0.0 225.0 87.0 7.0 None 26751.340606 2.788089e+07 POLYGON ((-10897586.982 3488394.879, -10897570...

Add the median household income and total number of households attributes

In [3]:
MedHousInc = gpd.read_file('../shapefiles/ACS-MedHousInc/Median Household Income.shp')
census = census.merge(MedHousInc[['GEOdisplay', 'Households', 'MHI']], on='GEOdisplay')
census.head()
Out[3]:
GEOdisplay TOTAL_POP WHITE BLACK LATINOHISP ASIAN OTHERRACE MULTIRACIA NATIVEAMER HAWAIIAN_P SymbolID SHAPE_Leng SHAPE_Area geometry Households MHI
0 Census Tract 109.01, Hays County, Texas 7960.0 6085.0 253.0 1377.0 64.0 14.0 103.0 0.0 64.0 None 48409.230018 1.130095e+08 POLYGON ((-10895219.922 3521915.754, -10890904... 2840.0 94663.0
1 Census Tract 108.09, Hays County, Texas 5393.0 4349.0 0.0 967.0 50.0 4.0 23.0 0.0 0.0 None 41152.848304 7.651386e+07 POLYGON ((-10908679.015 3529037.211, -10908884... 1914.0 106096.0
2 Census Tract 109.09, Hays County, Texas 6502.0 4059.0 177.0 1896.0 120.0 11.0 239.0 0.0 0.0 None 10449.217082 6.744037e+06 POLYGON ((-10894708.046 3506062.749, -10895016... 2205.0 77132.0
3 Census Tract 109.06, Hays County, Texas 13567.0 6212.0 1009.0 5965.0 91.0 0.0 287.0 3.0 0.0 None 40118.295024 8.133303e+07 POLYGON ((-10890555.700 3504678.714, -10890126... 4203.0 60351.0
4 Census Tract 103.03, Hays County, Texas 8042.0 4257.0 786.0 2661.0 19.0 0.0 225.0 87.0 7.0 None 26751.340606 2.788089e+07 POLYGON ((-10897586.982 3488394.879, -10897570... 2993.0 31494.0

As evident in the data description, the data includes the total population and the distribution of this population across races. First, let's start by getting the minority population and percent minority population, and then plot the geographic distribution of minorities.

In [4]:
census['minority'] = census['TOTAL_POP'] - census['WHITE']  # the attribute 'WHITE' refers to non-Hispanic Whites
census['prop_minority'] = census['minority'] / census['TOTAL_POP']  # get the proportion of minorities in each census tract
census['prop_black'] = census['BLACK'] / census['TOTAL_POP']
census['prop_latino'] = census['LATINOHISP'] / census['TOTAL_POP']
census['prop_asian'] = census['ASIAN'] / census['TOTAL_POP']
census['prop_white'] = census['WHITE'] / census['TOTAL_POP']
census.head()
Out[4]:
GEOdisplay TOTAL_POP WHITE BLACK LATINOHISP ASIAN OTHERRACE MULTIRACIA NATIVEAMER HAWAIIAN_P ... SHAPE_Area geometry Households MHI minority prop_minority prop_black prop_latino prop_asian prop_white
0 Census Tract 109.01, Hays County, Texas 7960.0 6085.0 253.0 1377.0 64.0 14.0 103.0 0.0 64.0 ... 1.130095e+08 POLYGON ((-10895219.922 3521915.754, -10890904... 2840.0 94663.0 1875.0 0.235553 0.031784 0.172990 0.008040 0.764447
1 Census Tract 108.09, Hays County, Texas 5393.0 4349.0 0.0 967.0 50.0 4.0 23.0 0.0 0.0 ... 7.651386e+07 POLYGON ((-10908679.015 3529037.211, -10908884... 1914.0 106096.0 1044.0 0.193584 0.000000 0.179307 0.009271 0.806416
2 Census Tract 109.09, Hays County, Texas 6502.0 4059.0 177.0 1896.0 120.0 11.0 239.0 0.0 0.0 ... 6.744037e+06 POLYGON ((-10894708.046 3506062.749, -10895016... 2205.0 77132.0 2443.0 0.375731 0.027222 0.291603 0.018456 0.624269
3 Census Tract 109.06, Hays County, Texas 13567.0 6212.0 1009.0 5965.0 91.0 0.0 287.0 3.0 0.0 ... 8.133303e+07 POLYGON ((-10890555.700 3504678.714, -10890126... 4203.0 60351.0 7355.0 0.542124 0.074372 0.439670 0.006707 0.457876
4 Census Tract 103.03, Hays County, Texas 8042.0 4257.0 786.0 2661.0 19.0 0.0 225.0 87.0 7.0 ... 2.788089e+07 POLYGON ((-10897586.982 3488394.879, -10897570... 2993.0 31494.0 3785.0 0.470654 0.097737 0.330888 0.002363 0.529346

5 rows × 22 columns

Now let's do some plotting, starting by a plot of proportion minorities versus median household income. A very clear negative correlation is apparent.

The figure options include zoom, reset view, and save

In [5]:
p=figure(x_range=(min(census['MHI']), max(census['MHI'])), y_range=(0,1), plot_height=550, plot_width=550, tools="pan,wheel_zoom,reset,save")
source = ColumnDataSource(census.drop(columns=['geometry']))
p.circle('MHI', 'prop_minority', size=8, color='darkgrey', alpha=0.5, source=source)
p.title.text = "Prop. Minority vs. Median Household Income"
p.xaxis.axis_label = 'Median Household Income ($)'
p.yaxis.axis_label = 'Proportion Minority in Census Tract'
p.xgrid.visible = False
print("There is a negative linear (pearson's) correlation of:", census['prop_minority'].corr(census['MHI']) )
show(p)
There is a negative linear (pearson's) correlation of: -0.5689368646780201

Let's do a geographic plot-- hover above the map below to find the proportion Black or Latino in each census tract

In [6]:
census['x'] = census['geometry'].apply(u.getPolyCoords, coord_type='x')
census['y'] = census['geometry'].apply(u.getPolyCoords, coord_type='y')
x_range = (min(min(census['x'])), max(max(census['x'])))
y_range = (min(min(census['y'])), max(max(census['y'])))

tile_provider = get_provider(Vendors.CARTODBPOSITRON)  # get the tiles from bokeh

## proportion minorities mapper
mrt_mapper = linear_cmap('prop_minority', 'Viridis256', min(census['prop_minority']), max(census['prop_minority']))
tooltips = [
    ("Prop. Minority", "@prop_minority"),
    ("Prop. Black", "@prop_black"),
    ("Prop. Latino", "@prop_latino")
]
plot_prop = figure(title="Proportion of Minorities", x_range=x_range, y_range=y_range, plot_height=550, plot_width=550,
           x_axis_type="mercator", y_axis_type="mercator", tools="pan,wheel_zoom,reset,hover", tooltips=tooltips)
plot_prop.grid.visible = False
plot_prop.add_tile(tile_provider)
census_source = ColumnDataSource(census.drop(columns=['geometry']))
plot_prop.patches('x', 'y', source=census_source, 
          fill_color=mrt_mapper,
         fill_alpha=0.5, line_color="black", line_width=0.05)

color_bar = ColorBar(color_mapper=mrt_mapper['transform'], width=12,  location=(0,0))
plot_prop.add_layout(color_bar, 'right')
show(plot_prop)

Transit

The GTFS data is available on the City of Austin's Open Data website (which links to the State of Texas Open Data portal). This data can be used for measuring the change in service after implementation of CapRemap. The Open Data portal provides GTFS data for the Apr-Jun service period (pre-CapRemap) and for the Jun-Aug service period (post-CapRemap). For geographic analysis and visualization, the Open Data portal also provides shapefiles of the transit system pre-CapRemap and post-CapRemap

Generally, CapMetro revises its services every January, June, and August. In typical cases, major changes are implemented in August and minor ones are also implemented in January and June. However, in 2018, CapMetro implemented CapRemap in June-- a major overhaul of their system. Additional changes to support CapRemap were also implemented later in August. Earlier in the beginning of the year, they adjusted service on January 7th, but the Jan 2018 service period was divided further into two periods: Jan-Mar and Apr-Jun. This analysis uses the Apr-Jun GTFS data from the January service period since it directly preceded CapRemap.

A detailed description of the service changes (6-3-2018) can be found here.

Note: in addition to the usual GTFS data, CapMetro provides a stop_times_sup.txt file that gives the departure times of each trip at every stop; this file is convenient for measuring the change in the number of doors opening at a stop.

First, let's get the data for the January 2018 service period:

In [7]:
# GTFS data Apr-Jun, the term 'Jan' is used to distinguish this data
tripsJan18 = pd.read_csv('../shapefiles/Jan2018/Jan2018-Apr-Jun/trips.txt')
stopsTimesJan18_sup = pd.read_csv('../shapefiles/Jan2018/Jan2018-Apr-Jun/stop_times_sup.txt')

# shapefiles uploaded to data.austintexas.gov, Jan-Mar service period
stopsJan18 = gpd.read_file('../shapefiles/Jan2018/Stops.shp')  # crs is 'epsg:32614', UTM Zone 14 projection with WGS84 datum, units of meters
routesJan18 = gpd.read_file('../shapefiles/Jan2018/Routes.shp')  # crs is 'epsg:32614', https://epsg.io/32614

Now, let's get the data for the June 2018 service change

In [8]:
# data below without the "Jan" qualifier corresponds to June 2018 data after the CapMetro Remap
# GTFS data
trips18 = pd.read_csv('../shapefiles/Jun2018/Jun2018/trips.txt')
stopsTimes18_sup = pd.read_csv('../shapefiles/Jun2018/Jun2018/stop_times_sup.txt')

# shapefiles uploaded to data.austintexas.gov
stops18 = gpd.read_file('../shapefiles/Jun2018/Stops.shp')  # crs is 'epsg:32614'
routes18 = gpd.read_file('../shapefiles/Jun2018/Routes.shp')  # crs is 'epsg:32614'

For a stop-level analysis of service, one approach is to create buffers with 1/4 mile radii around each stop. The demographics within those buffers represent the population affected by the service changes

Note that the data is imported in a UTM Zone 14 projection (WGS84 datum); in contrast to web-mercator, this projection is suitable for distance-based computations. The shapefiles will be converted back to web-mercator subsequently for visualization.

In [9]:
# get buffer for post-ReMap stops, a 1/4 mile is about 400 meters
stops18['buffer'] = stops18['geometry'].buffer(400)  # can compute buffer accurately since data is in UTM Zone 14 with WGS84 datum
stops18 = stops18.set_geometry('buffer').drop(columns=['geometry'])  # set the buffer as the active geometry and drop the centroid as geometry
# now do the same for january stops
stopsJan18['buffer'] = stopsJan18['geometry'].buffer(400)
stopsJan18 = stopsJan18.set_geometry('buffer').drop(columns=['geometry'])

Then, the departure times data is trimmed to include weekday morning peak service only. To do so, the departure time data in stops_times_sup.txt is merged with the route_id and service_id in the trips.txt file. Then, the weekday service_id's are extracted, and those are restricted further to departures between 7 a.m. and 10 a.m.

The weekday service_id's for the Apr-Jun (pre-CapRemap) data are as follows: '119-1', '119-12301' (i.e., Weekday, UT Monday-Thursday)

The weekday service_id's for the Jun (post-CapRemap) data are as follows: '115-1', '115-5001', '115-5004' (i.e., Weekday, UT Summer Weekday, UT Summer Weekday)

In [10]:
# merge departure times with trips info to be able to thin departure times based on weekday peak-hour trips
stopsTimesJan18_sup = stopsTimesJan18_sup.merge(tripsJan18[['route_id', 'trip_id', 'service_id']], on=['trip_id'], how='left')
stopsTimes18_sup = stopsTimes18_sup.merge(trips18[['route_id', 'trip_id', 'service_id']], on=['trip_id'], how='left')

# used merged dep. time dataframes to filter out service ID's that are not during the week day
stopsTimesJan18_sup = stopsTimesJan18_sup[stopsTimesJan18_sup['service_id'].isin(['119-1', '119-12301'])]  # find week-day trips for departure times
stopsTimes18_sup = stopsTimes18_sup[stopsTimes18_sup['service_id'].isin(['115-1', '115-5001', '115-5004'])]

# find morning peak departures
peak_hour = (7,10)  # between 7:00 a.m. and 10:00 a.m.
stopsTimesJan18_sup['is_peak'] = stopsTimesJan18_sup['departure_time'].apply(u.isPeak, peak_hour=peak_hour)
stopsTimes18_sup['is_peak'] = stopsTimes18_sup['departure_time'].apply(u.isPeak, peak_hour=peak_hour)
stopsTimesJan18_sup = stopsTimesJan18_sup[stopsTimesJan18_sup['is_peak']==1]
stopsTimes18_sup = stopsTimes18_sup[stopsTimes18_sup['is_peak']==1]  # now the departure time geodataframes correspond to morning peak weekday service

# get the stop IDs used during the weekday morning peak
ph_stopsJan18 = set(stopsTimesJan18_sup['stop_id'])
ph_stops18 = set(stopsTimes18_sup['stop_id'])

# use the stop ids to get removed stops, new stops, and kept stops-- restricted to morning peak!
removed_stops = ph_stopsJan18 - ph_stops18
new_stops = ph_stops18 - ph_stopsJan18
kept_stops = ph_stopsJan18.intersection(ph_stops18)

Let's do some plots to find the new stops and the removed stops

In [11]:
# first, convert to the web-mercator crs
stops18 = stops18.to_crs({'init': 'epsg:3857'})
stopsJan18 = stopsJan18.to_crs({'init': 'epsg:3857'})
routesJan18 = routesJan18[routesJan18['SERVICE_ID'].isin(['117-1', '117-12301', '117-23501'])].to_crs({'init': 'epsg:3857'})
# the routes shapefile for the January service period is based on the Jan-Mar data instead of the Apr-Jun
# this shouldn't make any difference as it's only used for visualization purposes
In [12]:
# get coords for the stops
stops18['x'] = stops18['buffer'].apply(u.getPolyCoords, coord_type='x')
stops18['y'] = stops18['buffer'].apply(u.getPolyCoords, coord_type='y')
stopsJan18['x'] = stopsJan18['buffer'].apply(u.getPolyCoords, coord_type='x')
stopsJan18['y'] = stopsJan18['buffer'].apply(u.getPolyCoords, coord_type='y')
# get coords for the routes
routesJan18['multi'] = routesJan18['geometry'].apply(u.checkMulti)
routesJan18['x'] = routesJan18['geometry'].apply(u.getLineCoords, coord_type='x')
routesJan18['y'] = routesJan18['geometry'].apply(u.getLineCoords, coord_type='y')
In [13]:
# filter out the new and removed stops
new_stops_gdf = stops18[stops18['STOP_ID'].isin(new_stops)]  # get new stops
removed_stops_gdf = stopsJan18[stopsJan18['STOP_ID'].isin(removed_stops)]  # get removed stops

The figure below shows that many removed stops are in areas with a high minority population (East and South of Austin)

But, looking at removed or added stops only may be biased. For example, removal of a stop that had low service is shown while major reductions in service at other stops is not shown. In their transition to a high frequency network, CapMetro may have not added many stops but significantly improved the frequency at existing stops-- this change will not be shown as well.

In [14]:
# set up
x_range = (min(min(removed_stops_gdf['x'])), max(max(removed_stops_gdf['x']))  )
y_range = (min(min(removed_stops_gdf['y'])), max(max(removed_stops_gdf['y']))  )


plot_stopDiff = figure(title="Added and Removed Stops", x_range=x_range, y_range=y_range, plot_height=550, plot_width=550,
           x_axis_type="mercator", y_axis_type="mercator", tools="pan,wheel_zoom,reset")
plot_stopDiff.grid.visible = False
plot_stopDiff.add_tile(tile_provider)


r1 = plot_stopDiff.patches('x', 'y', source=census_source, 
          fill_color=mrt_mapper,
         fill_alpha=0.2, line_color="black", line_width=0.05)
plot_stopDiff.add_layout(color_bar, 'right')
plot_stopDiff.add_tools(HoverTool(
    tooltips=tooltips, renderers=[r1]
))



# plot routes
#### start with single
rJan18_single = routesJan18[routesJan18['multi']==0]
singleJan_route_source = ColumnDataSource(rJan18_single.drop(columns=['geometry']))
plot_stopDiff.multi_line('x', 'y', source = singleJan_route_source, color='orange', line_width=1)
#### do multi
for index, df_row in routesJan18[routesJan18['multi']==1].iterrows():
    plot_stopDiff.multi_line(df_row['x'], df_row['y'], color='orange', line_width=1)



# add the stops
#### new
new_stops_source = ColumnDataSource(new_stops_gdf.drop(columns=['buffer']))
n_stop = plot_stopDiff.patches('x', 'y', source=new_stops_source, 
          fill_color='green', line_color='black',
         fill_alpha=0.3)
#### removed
removed_stops_source = ColumnDataSource(removed_stops_gdf.drop(columns=['buffer']))
r_stop = plot_stopDiff.patches('x', 'y', source=removed_stops_source, 
          fill_color='red', hatch_pattern='left_diagonal_line', hatch_color='black', hatch_alpha=0.3, hatch_weight=0.5,
         line_color='black', fill_alpha=0.3)


show(plot_stopDiff)

Then, let's compute the change in number of doors opening at each stop.

In [15]:
# gets a gdf with the weekday morning peak stops
postRemap_stops_phwd = stops18[stops18['STOP_ID'].isin(ph_stops18)]
preRemap_stops_phwd = stopsJan18[stopsJan18['STOP_ID'].isin(ph_stopsJan18)]

# count number of buses stopping in the morning peak
## post remap
postRemap_counts = postRemap_stops_phwd['STOP_ID'].apply(u.getBusCount, dep_times=stopsTimes18_sup)
postRemap_stops_phwd = postRemap_stops_phwd.assign(BUS_COUNTS_POST = postRemap_counts)
## pre remap
preRemap_counts = preRemap_stops_phwd['STOP_ID'].apply(u.getBusCount, dep_times=stopsTimesJan18_sup)
preRemap_stops_phwd = preRemap_stops_phwd.assign(BUS_COUNTS_PRE = preRemap_counts)

Now merge the files and account for new/removed stops to get an 'impact' variable that is defined as follows:

impact = (buses passing stop post-CapRemap) - (buses passing stop pre-CapRemap)

If a stop is new it will have zero pre-CapRemap stops, otherwise, if it is removed then it will have zero post-CapRemap stops.

In [16]:
# first merge on kept stops
preRemap_stops_phwd_kept = preRemap_stops_phwd[preRemap_stops_phwd['STOP_ID'].isin(kept_stops)]
postRemap_stops_phwd_kept = postRemap_stops_phwd[postRemap_stops_phwd['STOP_ID'].isin(kept_stops)]
total_stops_phwd = preRemap_stops_phwd_kept.merge(postRemap_stops_phwd_kept[['STOP_ID', 'BUS_COUNTS_POST']], on=['STOP_ID'], how='left')

# now add the preRemap removed stops to the total_stops gdf
preRemap_stops_phwd_removed = preRemap_stops_phwd[preRemap_stops_phwd['STOP_ID'].isin(removed_stops)]
preRemap_stops_phwd_removed = preRemap_stops_phwd_removed.assign(BUS_COUNTS_POST=[0]*preRemap_stops_phwd_removed.shape[0])  # removed stops have zero buses passing post-CapRemap
total_stops_phwd = total_stops_phwd.append(preRemap_stops_phwd_removed, sort=True)  # append to total_stops

# add the new stops in postRemap to the total_stops gdf
postRemap_stops_phwd_new = postRemap_stops_phwd[postRemap_stops_phwd['STOP_ID'].isin(new_stops)]
postRemap_stops_phwd_new = postRemap_stops_phwd_new.assign(BUS_COUNTS_PRE=[0]*postRemap_stops_phwd_new.shape[0])  # new stops have zero buses passing pre-CapRemap
total_stops_phwd = total_stops_phwd.append(postRemap_stops_phwd_new, sort=True)

# get the difference in the number of buses stopping at each stop before and after remap
total_stops_phwd['impact'] = total_stops_phwd['BUS_COUNTS_POST'] - total_stops_phwd['BUS_COUNTS_PRE']

Proceed to plot the difference in buses stopping after CapRemap. This figure may not be very informative, but zooming in it is possible to get the change in service at every stop!

In [17]:
## diff mapper
diff_mapper = linear_cmap('impact', 'Plasma256', min(total_stops_phwd['impact']), max(total_stops_phwd['impact']))
tooltips = [
    ("Bus Count Diff", "@impact")
]
plot_diff = figure(title="Bus Count Difference", x_range=x_range, y_range=y_range,plot_height=550, plot_width=550,
           x_axis_type="mercator", y_axis_type="mercator", tools="pan,wheel_zoom,reset,hover", tooltips=tooltips)
plot_diff.grid.visible = False
plot_diff.add_tile(tile_provider)
stop_source = ColumnDataSource(total_stops_phwd.drop(columns=['buffer']))
plot_diff.patches('x', 'y', source=stop_source, 
          fill_color=diff_mapper,
         fill_alpha=0.2, line_color="black")



color_bar = ColorBar(color_mapper=diff_mapper['transform'], width=12,  location=(0,0))
plot_diff.add_layout(color_bar, 'right')
show(plot_diff)

Plot a distribution of the service changes

In [18]:
phist=figure(title="Change in the Number of Doors Opening at a Stop", plot_height=550, plot_width=550)
histo, edges = np.histogram(total_stops_phwd['impact'], density=True, bins=30)
phist.quad(top=histo, bottom=0, left=edges[:-1], right=edges[1:],
     fill_color="#036564", line_color="#033649")
phist.y_range.start = 0
phist.xaxis.axis_label = '(Doors Opening after CapRemap) - (Doors Opening before CapRemap)'
phist.yaxis.axis_label = 'Proportion'
phist.x_range.range_padding = 0.03
show(phist)

Let's get stops that had more than 10 buses removed during CapRemap or more than 10 buses added after CapRemap. This could be a better representation of the service changes.

In [19]:
total_stops_phwd['BUS_DIFF_BOOL'] = total_stops_phwd['impact'].apply(u.getBusDiffBool, threshold=10)  # assigns +1 if 10 additional buses, -1 if lost more than 10 buses, and 0 if change is between -10 and 10
neg_thresh_total_stops = total_stops_phwd[total_stops_phwd['BUS_DIFF_BOOL'] == -1]
pos_thresh_total_stops = total_stops_phwd[total_stops_phwd['BUS_DIFF_BOOL'] == 1]
In [20]:
x_range = (min(min(neg_thresh_total_stops['x'])), max(max(neg_thresh_total_stops['x']))  )
y_range = (min(min(neg_thresh_total_stops['y'])), max(max(neg_thresh_total_stops['y']))  )


plot_stopDiff = figure(title="Net Loss or Gain of More than 10 Buses at the Morning Peak", x_range=x_range, y_range=y_range, plot_height=550, plot_width=550,
           x_axis_type="mercator", y_axis_type="mercator", tools="pan,wheel_zoom,reset")
plot_stopDiff.grid.visible = False
plot_stopDiff.add_tile(tile_provider)

tooltips = [
    ("Prop. Minority", "@prop_minority"),
    ("Prop. Black", "@prop_black"),
    ("Prop. Latino", "@prop_latino")
]

r1 = plot_stopDiff.patches('x', 'y', source=census_source, 
          fill_color=mrt_mapper,
         fill_alpha=0.3, line_color="black", line_width=0.05)
color_bar = ColorBar(color_mapper=mrt_mapper['transform'], width=12,  location=(0,0))
plot_stopDiff.add_layout(color_bar, 'right')

plot_stopDiff.add_tools(HoverTool(
    tooltips=tooltips, renderers=[r1]
))



# plot routes
#### start with single
rJan18_single = routesJan18[routesJan18['multi']==0]
singleJan_route_source = ColumnDataSource(rJan18_single.drop(columns=['geometry']))
plot_stopDiff.multi_line('x', 'y', source = singleJan_route_source, color='orange', line_width=1)
#### do multi
for index, df_row in routesJan18[routesJan18['multi']==1].iterrows():
    plot_stopDiff.multi_line(df_row['x'], df_row['y'], color='orange', line_width=1)



# add the stops
pos_stops_source = ColumnDataSource(pos_thresh_total_stops.drop(columns=['buffer']))
plot_stopDiff.patches('x', 'y', source=pos_stops_source, 
          fill_color='green', line_color='black',
         fill_alpha=0.3)


neg_stops_source = ColumnDataSource(neg_thresh_total_stops.drop(columns=['buffer']))
plot_stopDiff.patches('x', 'y', source=neg_stops_source, fill_color='red', hatch_pattern='left_diagonal_line',
                                 hatch_color='black', hatch_alpha=0.3, hatch_weight=0.5, line_color='black', 
                                 fill_alpha=0.3)




show(plot_stopDiff)

Stop-Level Demographic Attributes

The census data gives the number of minorities and the corresponding percent minority in each census tract. Since the influence of service changes is limited to 1/4 mile buffers around stops, these demographics characteristics should be measured at the stop level.

As shown in the figure below, the stop demographics can be computed by (1) finding the census tracts that overlap with the stop buffer, and (2) weighting the variables by the area of intersection between the stops and the census tracts. In the example below, 25% of the stop lies in Tract 1 and the remainder is in Tract 2. The percent minority for the stop is the area-weighted percent minority in each tract. As for the number of minorities in the stop, if it is assumed that the minority population is uniformly distributed throughout the tract, then the proportion of total minority population that is within the stop can be computed through multiplying the total population by the ratio of $\frac{\text{area of intersection between stop and tract}}{\text{tract area}}$.

In general, the equations for computing stop level demographics are given below:

In [21]:
# project to UTM Zone 14 for accurate area computations
census = census.to_crs({'init': 'epsg:32614'})
total_stops_phwd = total_stops_phwd.to_crs({'init': 'epsg:32614'})
total_stops_phwd = total_stops_phwd.reset_index().drop(columns=['index'])

WARNING: The following methods would take a bit of time to compute, on the order of few minutes each.

In [22]:
total_stops_phwd['prop_minority'] = total_stops_phwd['buffer'].apply(u.getStopPropDemog, census=census, group='prop_minority')
total_stops_phwd['minority'] = total_stops_phwd['buffer'].apply(u.getStopDemog, census=census, group='minority')
In [23]:
total_stops_phwd['prop_white'] = total_stops_phwd['buffer'].apply(u.getStopPropDemog, census=census, group='prop_white')
total_stops_phwd['white'] = total_stops_phwd['buffer'].apply(u.getStopDemog, census=census, group='WHITE')
In [24]:
total_stops_phwd['prop_black'] = total_stops_phwd['buffer'].apply(u.getStopPropDemog, census=census, group='prop_black')
total_stops_phwd['black'] = total_stops_phwd['buffer'].apply(u.getStopDemog, census=census, group='BLACK')

Service Evaluation

As previously discussed, the impact of CapRemap at a particular stop can be computes as impact=(buses passing post-CapRemap) - (buses passing pre-CapRemap). To isolate improvement/reduction in service, let's distinguish positive impact (doors opening) from negative impact (doors closing); in particular, doors opening is the net increase in buses passing at the AM peak when the post-CapRemap number of buses is greater than the pre-CapRemap value (the doors closing metric is similarly defined). For clarity, those stop-level service change descriptors are shown below:

From that, proceed to calculate aggregate measures representing the impact on minorities. The Expected Impact shown below is the average service change experienced by a minority. In other words, if a minority person is randomly chosen within the impact area, the expected service change they will receive is given by Expected Impact. Note that the denominator, the total number of minorities in the coverage area, could not be computed by simply summing the minorities within each stop since many stops overlap and this would result in counting the same person more than once. Instead, the coverage area is determined by the union of all the stops, and the total minorities in this aggregate shapefile is computed by finding the proportion of minorities in each tract that intersects the coverage area. The numerator is proper as shown since each person experiences the impact of all service changes that are within walking distance.

While the Expected Impact metric is insightful, it depends on the density of minorities across the network. In general, it may not be adequate to make equity metrics depend on the density of population. For example, greatly improving service in an area with a large number of minorities but leaving out others unconnected would give a large positive Expected Impact, but this may be undesirable.

Instead, the Frac DO and Frac DC metrics are not density-based since they are derived from the percent minorities in each stop (i.e. not the absolute number of minorities around a stop). The Frac DO metric gives the fraction of the total doors opening (service improvements) that are provided to minorities (i.e., the proportion of service improvements that benefit minorities). Similarly, the Frac DC metric represents the fraction of total doors closing to minorities.

In [25]:
total_stops_phwd['doors_opening'] = total_stops_phwd['impact'].apply(u.getDoorsOpening)
total_stops_phwd['doors_closing'] = total_stops_phwd['impact'].apply(u.getDoorsClosing)

Find the total coverage area, and the total population of minorities in the coverage area.

In [26]:
coverage_area = total_stops_phwd['buffer'].unary_union
total_minority = u.getStopDemog(coverage_area, census, group='minority')
total_white = u.getStopDemog(coverage_area, census, group='WHITE')
total_black = u.getStopDemog(coverage_area, census, group='BLACK')

Expected impact on minorities. If a minority person was sampled at random, this is their expected change in the number of buses passing during the morning peak.

In [27]:
expected_impact_minorities = (  ( (total_stops_phwd['impact'])*(total_stops_phwd['minority'])  ).sum()  )/( total_minority   )
expected_impact_minorities
Out[27]:
-5.541645845119857

Expected impact on White people. If a White person was sampled at random, this is their expected change in the number of buses passing during the morning peak.

In [28]:
expected_impact_white = (  ( (total_stops_phwd['impact'])*(total_stops_phwd['white'])  ).sum()  )/( total_white  )
expected_impact_white
Out[28]:
-6.09813708922815

Expected impact on Black people. If a Black person was sampled at random, this is their expected change in the number of buses passing during the morning peak.

In [29]:
expected_impact_black = (  ( (total_stops_phwd['impact'])*(total_stops_phwd['black'])  ).sum()  )/( total_black   )
expected_impact_black
Out[29]:
-4.254418702599492

Fraction of doors opening (service improvements) that went to areas with minorities

In [30]:
frac_do_minorities = (  ( (total_stops_phwd['doors_opening'])*(total_stops_phwd['prop_minority'])  ).sum()  )/(total_stops_phwd['doors_opening'].sum())
frac_do_minorities
Out[30]:
0.5465004435837215

Fraction of doors opening (service improvements) that went to areas with White people

In [31]:
frac_do_white = (  ( (total_stops_phwd['doors_opening'])*(total_stops_phwd['prop_white'])  ).sum()  )/(total_stops_phwd['doors_opening'].sum())
frac_do_white
Out[31]:
0.45114944514975097

Fraction of doors opening (service improvements) that went to areas with Black people

In [32]:
frac_do_black = (  ( (total_stops_phwd['doors_opening'])*(total_stops_phwd['prop_black'])  ).sum()  )/(total_stops_phwd['doors_opening'].sum())
frac_do_black
Out[32]:
0.09887292513334757

Fraction of doors closing (service reductions) that went to areas with minorities

In [33]:
frac_dc_minorities = (  ( (total_stops_phwd['doors_closing'])*(total_stops_phwd['prop_minority'])  ).sum()  )/(total_stops_phwd['doors_closing'].sum())
frac_dc_minorities
Out[33]:
0.5234510249333159

Fraction of doors closing (service reductions) that went to areas with White people

In [34]:
frac_dc_white = (  ( (total_stops_phwd['doors_closing'])*(total_stops_phwd['prop_white'])  ).sum()  )/(total_stops_phwd['doors_closing'].sum())
frac_dc_white
Out[34]:
0.47329738170489416

Fraction of doors closing (service reductions) that went to areas with Black people

In [35]:
frac_dc_black = (  ( (total_stops_phwd['doors_closing'])*(total_stops_phwd['prop_black'])  ).sum()  )/(total_stops_phwd['doors_closing'].sum())
frac_dc_black
Out[35]:
0.07839881288321765

Conclusions

It is evident that CapRemap did not significantly improve service throughout the network; in fact, the expected change in service on all demographics was negative-- indicating that, on average, Austin's residents observed less buses passing during the morning peak. While CapRemap added high frequency routes, this was at the expense of other non-frequent service.

In terms of equity, there does not seem to be any bias against minorities or Black people. The results show that Black people had the least negative impact when counting the expected number of buses passing during the AM peak. The fraction of service improvements that went to areas with Black people was low (only 9.8% of the total service improvements). However, at 7.8%, the fraction of service reductions that was inflicted on areas with Black people was also low. In general, minority areas were allotted 55% of the total service improvements (doors opening) and they received 52% of the total service reductions. Meanwhile, areas with White people were allotted 45% of the total service improvements and they received 47% of the total service reductions. These results show that minority areas did not simultaneously receive a lower fraction of the service improvements and a greater fraction of the service reductions, which indicates that there is no apparent bias against minorities in the distribution of service changes.