Election day was last Tuesday. I’ll do a full analysis of the results in a later post, but first I want to talk about the live election tracker, which you all participated in beyond my wildest dreams. How did it do?
I developed and tuned the model without having any data that would look like what we actually got on election day: I had final vote counts by division, but not the time-series—how many people voted by 8am? How many by 3pm?—that I would need to use on election day. Now that I have that, let’s evaluate how the model worked.
Here was my last prediction from Tuesday night, for Democratic turnout.
I don’t have the final Democratic turnout yet, but the final overall count was… (drumroll)… 172,466. For Democrats and Republicans combined. At historic rates, this would mean about 162 thousand Democratic votes. That’s well outside my margin of error, nearly 4 standard errors below the estimate. Whoops.
So what happened? I’ve dug into the model and broken the results into three categories: the good, the bad, and the really really stupid.
First, let me remind you what the model was. Residents across the city submitted three data points: their Ward+Division, the time of day, and the voter number at their precinct.
I modeled the cumulative votes V for submission i, in Division d, at time t, using the log scale:
log(V_idt) = alpha_t + beta_d + e_i
This appears easy to fit, but the sparsity of the data makes it quite complicated. We don’t observe every division at every time, obviously, so we need to borrow information across data points. I used an E-M style algorithm with the following parametrizations:
alpha_t ~ loess(t)
beta_d ~ MultiVariateNormal(mu, Sigma)
e_i ~ normal(0, sigma_e)
This model implicitly assumes that all divisions have the same distribution of votes through the day. For example, while divisions can have different vote totals overall, if one division has 30% of its votes by 9am, then all do (plus or minus noise). I did not enforce that the time series should be non-decreasing, though obviously it must be. Instead, I figured the data would largely corroborate that fact. This does mean that the estimated trend can in general go down, due to noise when the true trend is flat. Eh, it largely works.
As for the divisions, mu is forced to have mean 0, so the overall level of voting is absorbed into alpha. I use mu and Sigma estimated on the historic final turnout counts of each division, and the covariance matrix of that turnout. I discuss what that covariance looks like here.
The final turnout is estimated as exp(alpha) * sum(exp(beta_d)), ignoring the small correction for asymmetry in exp(e_i).
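To make that last formula concrete, here is a minimal sketch of the arithmetic. The values of alpha and beta_d are made up for illustration (three divisions instead of 1,686), and the function name is hypothetical, not taken from the tracker's code.

```python
import math

def estimate_total_turnout(alpha, betas):
    """Citywide estimate exp(alpha) * sum(exp(beta_d)) over all divisions."""
    return math.exp(alpha) * sum(math.exp(b) for b in betas)

# Hypothetical values: three divisions whose effects average to zero.
alpha = math.log(100.0)      # citywide level on the log scale
betas = [-0.5, 0.0, 0.5]     # division effects
total = estimate_total_turnout(alpha, betas)
print(round(total, 1))
```

Note that the total comes out above 3 x 100 even though the betas average to zero, because exponentiating is not symmetric around zero.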
Evaluating the Model: The good, the bad, and the really really stupid.
Good: your participation.
We ended the day with 641 submissions. Holy shit. Y’all are awesome.
[Note: I’m including some straggler submissions that came in through the rest of the night, which is why this result and what’s below don’t match the night-of results exactly. None of this changes the substantive results]
Really Really Stupid: What does Voter Count represent?
I announced that I was estimating Democratic turnout, because I remembered someone from the last primary saying that voter numbers are broken down by party. Turns out that’s wrong. And I never bothered to confirm. So the whole time I said I was estimating Democratic turnout, the model was actually estimating the total number of voters. So I should have been pointing to that overall 172 thousand popular count, and that's a lot closer to my estimate. From here on out, let's pretend I didn't mess that up, and use the 172K as the test comparison.
Bad (but defensible): Smoothing and the thunderstorm.
I never imagined I would have over 600 data points. To give you a sense of how many points I was expecting, I built and tested the model on datasets of 40. At that size, it was really important to strongly borrow information across time and divisions, and not overfit single points. So I hard-coded very strong smoothing into the time series. Notice how the time plot above keeps going up in a straight line from 5pm to 8pm.
It turns out (because y'all are amazing) that I had enough data points to use much less smoothing, and let the model identify more nuanced bends in the data. That would have helped a lot. Here’s what the plot looks like if I use much less smoothing:
You may also have noticed the giant thunderstorm that blanketed the city at 7pm. It turns out that voting completely flattened that hour. The over-smoothed model just assumed the linear trend would keep on going.
When I fit the model with the more adaptive smoothing parameter, the prediction would have been 173,656 (CI: 157,150 - 192,058). That’s… really close.
Good: Ward relative turnout
On the beta side of the model, things turned out pretty well. Here is the plot I created with the change in votes from 2014.
Here is a recent tweet from Commissioner Schmidt’s office.
The maps match, albeit with 39 and 26 switching places as well as 18 and 31. (Full disclosure: the size of the increase I was showing in Ward 26 made me *really* worried on election day that I had messed something up. Turns out that increase was real).
It's worth emphasizing how cool this is: with 641 submissions, many from before noon, we were able to almost perfectly rank wards based on their increased turnout.
Another way to evaluate the spatial part of the model is to compare my Division-level predictions with the true turnout. Below is a plot of my final estimate for each division on the x-axis, with the difference between the true turnout and my prediction (the residual) on the y-axis.
The residuals are centered nicely around zero. Notice that I only had data from 329 divisions, so 1,357 of these estimates came in divisions for which we observed no data, and were entirely borrowed from other divisions based on historic correlations.
Okay (?) ¯\_(ツ)_/¯: Confidence Intervals
In statistics, it's usually a lot easier to estimate a value than it is to put error bars around it. And a point estimate is of no use without a measure of its uncertainty. I've got a somewhat complicated model; how did my purported uncertainty do?
One way to test this out is the bootstrap. You sample from your observed data over and over with replacement, to create simulated, plausible data sets. You can then calculate your point estimate as if that were your data, and look at the distribution of those estimates. Voila, you have an estimate of the uncertainty in your method. The benefit of this is that you can mechanically explore the uncertainty in your full process, rather than needing to rely on math that uses perhaps-invalid assumptions.
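As a rough illustration of that recipe, here is a minimal percentile-bootstrap sketch. It assumes a toy data set and uses the plain mean as the estimator; in the real check the estimator would be a full refit of the turnout model, which is not shown here. All names and numbers are hypothetical.

```python
import random
import statistics

def bootstrap_interval(data, estimator, n_boot=2000, level=0.95, seed=0):
    """Percentile interval from re-estimating on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Hypothetical voter numbers, standing in for the real submissions.
voter_numbers = [48, 52, 61, 39, 75, 66, 58, 43, 70, 55]
low, high = bootstrap_interval(voter_numbers, statistics.mean)
print(low, high)
```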
This model is not a perfect use case of the bootstrap because it relies so heavily on having data from a variety of divisions. The bootstrap will necessarily provide fewer divisions than the data we have, because data points get repeated. Thus, we would expect the bootstrap uncertainty to be larger than the true uncertainty in the model with real data.
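The loss of divisions is easy to see in a small simulation. The sketch below assumes the 641 submissions are spread as evenly as possible over 329 divisions, which is an assumption for illustration (the real submissions were surely lumpier), and counts how many distinct divisions survive in each resample.

```python
import random

rng = random.Random(0)
n_submissions, n_divisions = 641, 329
# Assumed even spread: each submission is labeled with a division index.
divisions = [i % n_divisions for i in range(n_submissions)]

def distinct_in_resample(divs, rng):
    """Number of distinct divisions in one bootstrap resample."""
    resample = [divs[rng.randrange(len(divs))] for _ in divs]
    return len(set(resample))

counts = [distinct_in_resample(divisions, rng) for _ in range(200)]
mean_distinct = sum(counts) / len(counts)
print(mean_distinct, max(counts))
```

Under this even-spread assumption a typical resample keeps only around 280 of the 329 divisions, since a division with k submissions is left out with probability of roughly exp(-k).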
The bootstrap CIs are 35% larger than the estimated CIs I provided. I frankly don't have a great sense of whether this means my method underestimates the uncertainty, or if this is due to having fewer divisions in the typical bootstrap sample. I need to break out some textbooks and explore.
This project had some errors this time around, but it seems like with some easy fixes we could build something that does really, really well. Here are some additional features I hope to build in.
While you guys are awesome, some of you still made mistakes. Here is a plot from 12:30:
I was pretty sure that negatively many people didn't vote at lunch. That trend is entirely driven by that one person who reported an impossibly low number in their division at 12:30.
Here's another plot:
Someone claimed that they were the 10,821,001,651st voter in their division. That is larger than the population of the world and seems implausible.
Through the day I implemented some ad-hoc outlier detection methods, but for the most part my strategy was to manually delete the points that were obviously wrong. But there are some points that are still unclear, and which I ended up leaving in. I hope to build in more sophisticated tests for removing outliers by November.
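For a sense of what simple automated checks could look like, here is a sketch of two plausibility rules: a voter number must fall between 1 and some cap for the division, and it should not be lower than an earlier accepted report from the same division. This is not the ad-hoc detection used on election day; the cap (for example, registered voters per division), the division labels, and the function name are all assumptions.

```python
def flag_implausible(submissions, max_voters):
    """Return indices of submissions that fail simple plausibility checks.

    submissions: list of (division, minutes_since_open, voter_number)
    max_voters:  dict of division -> assumed upper bound on voters
    """
    flagged = []
    running_max = {}  # highest accepted voter number so far, per division
    order = sorted(range(len(submissions)), key=lambda i: submissions[i][1])
    for i in order:
        division, _minute, number = submissions[i]
        cap = max_voters.get(division, float("inf"))
        if number < 1 or number > cap:
            flagged.append(i)       # impossible count for this division
            continue
        if number < running_max.get(division, 0):
            flagged.append(i)       # cumulative count went down
            continue
        running_max[division] = number
    return sorted(flagged)

# Made-up submissions; the huge value echoes the one reported above.
submissions = [
    ("01-01", 60, 25),
    ("01-01", 330, 3),
    ("01-01", 600, 140),
    ("02-05", 200, 10_821_001_651),
]
max_voters = {"01-01": 900, "02-05": 1200}
flagged = flag_implausible(submissions, max_voters)
print(flagged)
```

One weakness of the second rule: if the earlier report is the wrong one, the later correct reports get flagged instead, so these flags are better treated as candidates for manual review than as automatic deletions.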
Predicting final outcomes
Because nobody had collected time-of-day data on Philadelphia voting before, I couldn't predict the end of day turnout. This means, for example, at 3pm I could estimate how many people had voted as of 3pm, but was not predicting what final turnout would look like. I simply didn't know how voting as of 3pm correlated with voting after 3pm. But *now*, we have one election's worth of time-of-day data! I am going to work up an analysis on the time-of-day patterns of voting (stay tuned to this blog!) and see if it's reasonable to add a live, end-of-day prediction to the tool.
See you in November!
I'm going to put aside the live modelling business for a while, and spend the next few months looking at static analyses of these results, and what they might mean for November. But don't worry, the Live Election Tracker will be back in November!
Share your voter number at bit.ly/sixtysixturnout!
Philadelphia's 2018 Primary elections are coming on May 15th. I'm excited to announce that Sixty-Six Wards will be *live tracking* turnout on election day. And I need help from you!
What I need from you
On election day, vote! When you sign in to vote, you can see what number voter you are in your division. After voting, log in at bit.ly/sixtysixturnout and share with me your (1) Ward, (2) Division, (3) Time of Day, and (4) Voter Number. Using that information, I've built a model that estimates the turnout across the city (see below).
You'll then be able to track the live election results at jtannen.github.io/election_tracker. Will turnout beat the 165,000 who voted in 2014, even with a non-competitive Senate and Governor primary? Will the surge seen in 2017 continue?
This estimation can only get better with more data points. So encourage your friends to vote, and share their voter number too!
Some notes about the data collection: I only collect the four data points above (Ward, Division, Time, and Voter Number), and no identifying information on submitters. I *will* share this data publicly--again, only those four questions--in hopes that it can prove useful to others. And I am only using Democratic primary results (sorry Republicans, but there are simply too few of you, especially in off-presidential years, for me to think I could make any valid estimate).
Now for what you really care about: the math.
Estimating turnout live requires simultaneously estimating two things: each Division's relative mobilization and the time pattern of voters throughout the day. The 100th voter means something different in a Division that had 50 voters in 2014 than it does in a Division that had 200, and it means something different at 8am than it does at 7pm. Further, Philadelphia has 1,686 Divisions, and I don't think we'll get data on every Division (no matter how well my dedicated readers blast out the link). I use historic correlations among Divisions to guess the current turnout in Divisions for which no one has submitted data.
To estimate turnout, I model the turnout X in division i at time t, in year y, as
log(X_ity) = a_y + b_ty + d_iy + e_ity
The variable a represents the overall turnout level in the city for this year. b_t is the time trend, which starts at exp(b_t) = 0 and finishes at exp(b_t) = 1 so the time trend progresses from 0% of voters having voted at 7am to 100% having voted at 8pm. d_i represents Division i's relative turnout versus other divisions for this year, and e_it is noise.
To best estimate d_i, especially for Divisions with no submitted data, I use historical data. Using the Philadelphia primaries since 2002 (excluding 2009, where something is weird with the data), I estimate each Division's average relative turnout versus other Divisions (its "fixed effect"), and the correlation among Divisions' log turnout across years. Divisions are often very similar to each other: when one Division turns out strongly in an election, similar Divisions do too.
Here are the estimated average relative turnouts (the fixed effects):
We see a familiar pattern. Center City, Germantown and Mount Airy, and Overbrook all vote at disproportionately high rates, while the universities, North Philly, and the northern River Wards vote at disproportionately low rates.
That's across all years. But the way Divisions over- or under-perform these averages in a given year creates patterns as well. Divisions' turnouts are correlated with each other.
We have 1,686 Divisions and only 16 years, so we need to simplify the covariance matrix to estimate covariance among Divisions. To do that, I use Singular-Value Decomposition to identify three dimensions of turnout. These dimensions represent groups of Divisions that swing together: when one Division in the group turns out higher than usual, the others do too. The signs are not meaningful; some years the Divisions with a positive sign turn out higher, other years those with negative signs. What's important is that the positive and negative signs move oppositely.
SVD assigns a score in each dimension for the Divisions and for the years. Divisions with a positive score in a dimension turn out more strongly in years with positive scores; Divisions with a negative score turn out more strongly in years with negative scores.
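Here is a small sketch of how one such dimension yields a score per Division and a score per year. It uses plain power iteration to recover only the leading dimension, as a stand-in for a full SVD routine, and runs on a made-up 4-year by 4-Division matrix of log turnout. The data and function name are hypothetical.

```python
import math

def leading_dimension(rows, n_iter=200):
    """Leading dimension of a years-by-divisions matrix via power iteration.

    Returns (division_scores, year_scores) for the centered matrix.
    """
    n_cols = len(rows[0])
    # Center each Division on its own average across years (its fixed effect).
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    x = [[r[j] - means[j] for j in range(n_cols)] for r in rows]
    v = [float(j + 1) for j in range(n_cols)]  # arbitrary starting vector
    for _ in range(n_iter):
        u = [sum(xr[j] * v[j] for j in range(n_cols)) for xr in x]           # X v
        v = [sum(x[i][j] * u[i] for i in range(len(x))) for j in range(n_cols)]  # X^T u
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    year_scores = [sum(xr[j] * v[j] for j in range(n_cols)) for xr in x]
    return v, year_scores

# Made-up data: two groups of Divisions that swing in opposite directions.
base = [5.0, 4.5, 4.0, 5.5]
pattern = [0.3, 0.3, -0.3, -0.3]
year_swings = [1.0, -1.0, 0.5, -0.5]
rows = [[base[j] + s * pattern[j] for j in range(4)] for s in year_swings]
div_scores, yr_scores = leading_dimension(rows)
print(div_scores, yr_scores)
```

In this toy example the product of a Division's score and a year's score reproduces that Division's deviation from its average in that year, which is why only the relative signs matter, not the signs themselves.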
Eyeballing the score maps together with the years' scores serves as a sanity check for years, and provides intuition to the underlying story. The dimensions are ordered from the strongest separation to the weakest. I'm not going to pay too close attention to the specific values of the scores; what matters most is the relative values.
Dimension 1 has clearly identified the racial divide in the city. Divisions with positive scores are predominantly Black and Hispanic, while divisions with negative scores are predominantly White (again, the signs are not meaningful). Divisions with positive scores voted disproportionately in 2012 and 2003 (President Obama's and Mayor Street's reelections, respectively), while divisions with negative scores turned out particularly strongly in 2017. Interestingly, the year with the lowest score, meaning the greatest disproportionate turnout in non-Hispanic White neighborhoods, was the 2017 DA's race won by Larry Krasner.
It's less obvious from the map what Dimension 2 captures, but the time series is clear: Dimension 2 identifies Divisions where relative turnout has steadily increased over the 16 years. This also lines up with the neighborhoods that have gentrified: Powelton, Fishtown, Fairmount, and Girard Estates have seen the strongest trends up, while broad swaths of North Philly and the Greater Northeast have seen relative decreases over time.
Dimension 3 identified Divisions that surge in turnout specifically for Presidential primaries. Penn and Drexel obviously see the strongest swings, though Southwest Philly and Hispanic sections of North Philly also have positive scores: these neighborhoods voted strongly in 2016 and 2008, relative to their typically low overall turnout.
These dimensions allow me to calculate the smoothed covariance matrix Sigma, among divisions. In a given year, then, the vector of Division effects, d, is drawn from a multivariate Normal:
d ~ MVNormal(mu, Sigma),
with mu the Division fixed effects.
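One standard consequence of a multivariate Normal is worth spelling out, because it shows how a correlated Division can inform one with no data. Given an observed effect in Division A, the expected effect in Division B shifts by Sigma_AB / Sigma_AA times A's surprise. The sketch below uses two Divisions with made-up values for mu and Sigma, ignores the noise term, and is not necessarily how the full model does the borrowing.

```python
def conditional_effect(mu_a, mu_b, var_a, var_b, cov_ab, observed_a):
    """Mean and variance of d_B given d_A under a bivariate Normal."""
    mean_b = mu_b + cov_ab / var_a * (observed_a - mu_a)
    var_b_given_a = var_b - cov_ab ** 2 / var_a
    return mean_b, var_b_given_a

# Hypothetical numbers: A usually runs at +0.2, but this year we see +0.5.
mean_b, var_b_given_a = conditional_effect(
    mu_a=0.2, mu_b=-0.1, var_a=0.04, var_b=0.05, cov_ab=0.03, observed_a=0.5
)
print(mean_b, var_b_given_a)
```

Here B's expected effect moves from -0.1 to about 0.125, and its variance shrinks from 0.05 to about 0.0275, purely because A came in higher than usual.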
The time series of voting throughout the day is currently completely unknown. What fraction of voters vote before 8am? What fraction at lunch time? Knowing how to interpret a data point at 11am hinges on the time profile of voting. For this model, I assume that all Divisions have the same time profile: all divisions have the same fraction of their voters vote before time t, with possible noise. Since I don't have historical data for this, I will estimate it on the fly.
I model the total time effect including the annual intercept, a_y + b_t, using a loess smoother.
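For readers who have not met loess, here is a simplified sketch of the idea: to estimate the curve at a given time, fit a straight line to the nearby observations, weighting closer points more heavily. This is a bare-bones local linear fit with tricube weights, not the smoother used in the tracker; the data, the span value, and the function name are assumptions.

```python
import math

def local_linear_fit(xs, ys, x0, span=0.5):
    """Loess-style estimate at x0: tricube-weighted linear fit to nearby points.

    Assumes at least three observations at distinct x values.
    """
    k = max(3, math.ceil(span * len(xs)))
    # Bandwidth: distance from x0 to its k-th nearest observation.
    h = sorted(abs(x - x0) for x in xs)[k - 1]
    weights = [
        (1 - (abs(x - x0) / h) ** 3) ** 3 if abs(x - x0) < h else 0.0
        for x in xs
    ]
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Made-up example: log cumulative votes rising linearly over 13 hours.
hours = list(range(14))
log_votes = [3.0 + 0.2 * t for t in hours]
fit = local_linear_fit(hours, log_votes, 6.5)
print(fit)
```

The span sets how much of the data each local fit uses: a larger span gives a smoother, stiffer curve, while a smaller span lets the curve bend with the data.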
The full model
Having specified each term in
log(X_ity) = a_y + b_ty + d_iy + e_ity,
I fit this model using Maximum Likelihood.
The output is a joint modeling of (a) the time distribution across the city and (b) the relative strength of turnout in neighborhoods. Come election day, you'll be able to see estimates of the current turnout, as well as how strong turnout is in your neighborhood!
See you on Election Day!
2018 could be an exciting moment for Philadelphia elections. The primary will give us an important signal for what to expect in November. Please help us generate live elections by voting, sharing your data, and getting your friends to do the same. See you May 15th!
Back in December, I looked at how many votes it takes to become a Democratic Party Committeeperson. Philadelphia's Committeepeople are the foot soldiers of the Party, responsible for getting out the vote and organizing the party in the 1,686 Divisions. Each Division has 2 committeepeople, so every four years a potential 3,372 Philadelphians are elected to the post (I focus on Democrats here, though the same is true for Republicans). In 2014, 348 (10%) of those positions went completely unfilled, while another 275 of the positions were won by Write-In candidates, usually in districts with fewer than two candidates on the ballot.
An open question for the upcoming May Primary is whether Democrats' newfound energy will translate to the rank-and-file positions of the local Party. Well, applications have been filed and the Commissioner's office released the official slate of candidates. How do the numbers bode for that trickle-down energy?
The hypothetical surge in Committeeperson candidates definitely did not materialize.
The surge in candidates didn't happen
In total, there are currently 3,204 candidates on the Democratic ballot, after a number of applications were contested and rejected. That compares to 3,098 that survived to the election in 2014. The counts are from slightly different points in the process. The 2014 data uses election results of non-write-in candidates, while the 2018 uses the recently released Commissioner's data on candidates who survived potential challenges.
There are currently a total of 558 seats that have no candidate on the ballot, 61 fewer than four years ago.
Some 1,615 of those candidates are incumbents. To calculate incumbent candidates, I use fuzzy text matching on candidate names between 2014 and 2018. This tests whether two names are the same based on the fraction of characters that are different. Matching is harder than it may seem because of variability of how candidates write out their names: spellings may change, a candidate may identify as Junior in one but not another, Elizabeth may change her listed name to Betsy. Rather than manually assign incumbency, I automate the process; I've spot checked the fuzzy matching on 40 borderline matches and think I've got a good first-order approximation to incumbency assignments.
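To illustrate the flavor of this kind of matching, here is a minimal sketch using Python's standard `difflib.SequenceMatcher`, whose `ratio()` scores how much of two strings lines up. It stands in for whatever matching routine was actually used; the names are invented and the 0.85 threshold is an assumption.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case and extra spaces."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()

def is_probable_incumbent(name_2018, names_2014, threshold=0.85):
    """True if any 2014 name is at least `threshold` similar (assumed cutoff)."""
    return any(name_similarity(name_2018, old) >= threshold for old in names_2014)

# Made-up names for illustration only.
winners_2014 = ["John Smith Jr", "Elizabeth Jones"]
print(is_probable_incumbent("John Smith", winners_2014))
print(is_probable_incumbent("Betsy Jones", winners_2014))
```

The dropped "Jr" still clears the threshold, but the Elizabeth-to-Betsy change does not: character-level similarity cannot see nicknames, which is the kind of case where spot checks earn their keep.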
The wards with the highest number of candidates per division are Wards 1 and 2 in Queens Village, 55 in the lower Northeast, and 46 in West Philly. The wards with the lowest include 27 and 20, which include Penn and Temple, respectively.
The most astonishing increase is in the 58th Ward in the Northeast, which has 66 more candidates in 2018 than it did in 2014. Ward 42 saw the greatest decrease.
There has not been an obvious surge in energy across the city. 34 of the 66 wards saw increases in the total number of candidates, while 29 saw a decrease.
Which Wards have the energy?
Maybe to see the energy, we need to look in specific places in the city. I often find two useful ways to break up the city: by race, and by vote in the 2016 primary. Race often captures an important axis for identity and experience in the city, while vote in the 2016 primary does two things: differentiates White wards between more establishment (Clinton voters) and less establishment voters (Sanders), and differentiates predominantly Black wards that nonetheless are in the process of gentrifying. Wards 46 and 47 neighbor Penn and Temple, respectively, and are predominantly Black Wards that voted relatively strongly for Sanders, largely because of the sizeable young White population.
There is one dimension that is not captured by the Race x 2016 primary distinction, and that is the Trumpiness of White voters. For example, many White Northeast Wards voted for Sanders, but then also swung towards Trump in the general election. This suggests that their Sanders votes may not have been a declaration of progressivism, but a vote against Clinton. Looking at Sanders-voting White wards will conflate the young progressives in the center of the city with the anti-Clinton voters of the Northeast.
For reference in the upcoming discussion, here is a map of wards' predominant race and ethnicity, calculated using the 2012-2016 American Community Survey.
Below is a plot of the average number of 2018 committeeperson candidates per division, plotted by predominant race and 2016 Primary vote. For the most part, White wards have the most candidates, followed by Black wards and then last Hispanic wards. The Black ward with the most candidates is ward 46, which is actually a rapidly gentrifying ward that Bernie almost won.
The most interesting trend is within Hispanic wards, where the trend in White and Black wards is reversed: the number of candidates on the ballot increases with vote for Clinton. Wards 7, 19, and 43 voted strongest for Clinton in the city, and have around 2 candidates per Division. I read this reversal as demonstrating that political organization is different within Hispanic wards from the others. In Hispanic wards, party organization is correlated with establishment votes, while the connection is less clear in other types of wards.
So how about the change in the number of candidates? Did this newfound energy make Bernie-loving wards mobilize committeepeople en masse?
Finally, those changes in candidates running also mean that White wards have the most non-incumbents running. Some 42% of candidates in White wards are incumbents, compared to 53% in Hispanic wards and 57% in Black wards.
What to look for in the Primary
The May 15th primary will see many new Committeepeople seated, and will decide which Democrats run in all of the important (newly-redistricted) U.S. House and the PA State House and Senate races. The surge in energy that many predicted didn't seem to materialize in residents running for committeeperson, but the primary will give us a much better sense of who is energized where, ahead of the national November midterms (and race for governor).
I've got some exciting news planned for the primary. Stay tuned!
I've been profiling each of the new Congressional Districts created when the state Supreme Court declared the prior boundaries unconstitutionally gerrymandered. Today I'm profiling the last of the Congressional Districts in the Philadelphia area, the new District 01.
District 01 mostly aligns with Bucks County, to the Northeast and North of the city. To accommodate the equal population requirement, it adds on Montgomeryville and Hatfield in Montgomery County.
The district is the most evenly split in the Philadelphia region. It voted for Clinton in 2016 by a slim 50.7 - 49.3 margin. This gap was two percentage points more Democratic than the state as a whole, though the District was 5.5 points *less* Democratic than the state in 2014. Bucks County provides the prime example of a suburban swing district, with traditional Republicans who swung against Trump. (Of course, the swing did not include all Republican voters by any means, but in this district a few percentage points matters.)
The district is predominantly White, and there is not a single State House District within it that is not at least a plurality White. Within that White population, there are demographic differences. The region immediately outside of Philadelphia looks a lot like an extension of the Northeast: it is the densest part of the County, and less wealthy than the County's center, around Doylestown. The lowest five statehouse districts, including Newtown, Churchville, and everything below, constitute a whopping 46% of the population.
That 46% of the population turns out at lower rates than the rest of the District, and only represents 42% of the votes. But in a district so evenly divided, subtle swings in any region with 42% of the vote (and especially a turnout increase, which is plausible in a district with such low baseline turnout) can determine the election.
Despite the low turnout in the south of the District, the District as a whole votes at much higher rates than the state. Measured as votes per population over 18, the district voted at a rate nine percentage points more than the state in 2016, and six points more in 2014, the last race for Governor.
The 2016 Democratic Primary illustrates some interesting splits. Consider the wealthy region around Newtown and Lambertville. It has very high turnout, and was evenly split between Clinton and Trump. However, voters there *strongly* supported Clinton over Sanders. In other districts, we've seen a correlation between support for Sanders and support for Trump, which I've interpreted as an anti-establishment (or anti-Clinton, depending on your reading) sentiment. However, these wealthy voters appear to be legitimate centrists: with a slight overall Republican lean, who voted against Sanders, while also swinging slightly against Trump.
Below are the racial splits for the District, though they deserve a strong word of caution. The calculation below assigns races the weighted average of the vote in the State House districts that residents live in. In a District so heavily White, the Black, Hispanic, and Asian residents will still live in a predominantly White district, so the differences between races presented will be understated.
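A toy example shows why this calculation understates the gaps. The sketch below assigns each group the population-weighted average of the vote share in the State House districts where its members live. The two districts, their populations, and their vote shares are all invented for illustration.

```python
def group_vote_share(districts, group):
    """Population-weighted average of district vote shares for one group."""
    total_pop = sum(d["pop"][group] for d in districts)
    weighted = sum(d["pop"][group] * d["share"] for d in districts)
    return weighted / total_pop

# Hypothetical districts: both heavily White, with different overall vote shares.
districts = [
    {"pop": {"White": 900, "Black": 100}, "share": 0.40},
    {"pop": {"White": 700, "Black": 300}, "share": 0.60},
]
white_est = group_vote_share(districts, "White")
black_est = group_vote_share(districts, "Black")
print(white_est, black_est)
```

The two estimates land at roughly 0.49 and 0.55. Since every resident is assigned their district's overall share, the estimated gap can never exceed the gap between districts, however differently the groups actually voted within each one.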
This wraps up the Philadelphia District Profiles. The redistricting removed the gerrymandering that was fabricating Republican Districts out of a broadly Democratic region. The result is that every one of the five compact districts in the region would have voted for Clinton in 2016, ranging from narrow victories (today's CD 01) to the Democratic strongholds in the state (CDs 02, 03). While the state as a whole still overrepresents Republicans--Republicans would have won 56% of these districts in 2016, when they won only 50% of the vote--the new districts are dramatically closer to matching the popular vote.
Tomorrow is the special election to replace Tim Murphy in the House of Representatives. Since I've got the machinery to analyze districts, I thought I'd prep some maps to see what to expect. The election will be held using the old districts, not the Supreme Court's new districts, and in any other year the Republican would almost certainly win. Donald Trump carried it by 19.5% just 20 months ago. The district ran 18.8 percentage points more Republican than the state in 2016 and 18.3 in 2014. But recent polls imply that this race is close. I'm not going to narrate, but thought I'd share the plots I made for myself.
I'm profiling each of the State Supreme Court's new Congressional Districts in the Philadelphia area, looking at their voting behavior and their demographics. Today, the new District 04.
District 04 covers Montgomery county, in Philadelphia's suburbs. This county had been among the most gerrymandered in the state, and saw the biggest changes under the Supreme Court's map. It's a politically diverse county, and chopping it up provided a huge boon to the Republicans. It's a swing-y county, and gets national attention as a pivotal suburb that seems to be trending Democratic.
The county combines Democratic neighborhoods in the southeast with Republican neighborhoods in the northwest. However, that doesn't end up being the relevant distinction to make. The northwest neighborhoods are sparsely populated, and represent very few votes. Instead, the most important distinction is between the heavily Democratic suburbs just outside of Philadelphia--Elkins Park, Glenside, Abington--and the marginally Democratic suburbs in the center. The GOP strategy had been to waste the votes of the former by lumping them in with all-Democrat Philadelphia, while distributing the latter with Republican districts to create safe-but-not-too-safe Republican districts.
Here's how the county used to be divided. It includes Goofy's head of the famed former District 7.
The new district is reliably Democratic.
The county is predominately White and higher income. The racial exception is Norristown, and the wealth exception is the more rural area in the northwest.
Turnout in 2016 came disproportionately from the inner suburbs. That's where the population is, but it's also where turnout per resident is highest.
This November is a Gubernatorial election. The turnout falls, but proportionately less than in the rest of the state. It also falls less in the southeast, so those neighborhoods are *even more* important in elections for Governor.
As I've pointed out in every one of these profiles, the Trump vote closely matches the Sanders vote. The district went 59-41 for Clinton over Sanders, a bigger Clinton win than the state overall. That was largely driven by the southeast.
The racial cross-tabs are less interesting for this district than others, mostly because it's so White. Perhaps most interesting is the stability of Hillary's primary numbers across races; she doesn't seem to have done quite so well in Black neighborhoods in the county as she did in Philadelphia.
This week, I'm profiling each of the State Supreme Court's new Congressional Districts, looking at their voting behavior and their demographics. Today, the new District 05.
District 05 is the first district we're looking at that stretches outside of Philadelphia. In total, 80% of its population comes from Delaware County, 16% from Philadelphia, and 4% from Montgomery. (That area in South Philly is deceiving; much of it is industrial and has no population).
The district contains portions of Bob Brady's old District 1, which used to stretch out to Chester in order to gerrymander Democratic votes together. It is much less gerrymandered now, though still doesn't have any Republican strongholds.
Turnout for the district is high, running six percentage points higher than the state as a whole. That largely is due to the wealthiest suburbs, where 70-80% of the over-18 population votes.
They also fall off less than the rest of the state between Presidential and Gubernatorial elections.
Again, much of the interesting story of the district is in the 2016 Democratic Primary. The district voted largely for Clinton, with a pattern that we saw in other Districts: Black neighborhoods overwhelmingly supported Clinton, wealthier White neighborhoods still supported her by around 20 percentage points, and middle income White neighborhoods and students swung the hardest towards Bernie (though still ended up at close to an even split).
This district displays the largest over-representation of White voters that we've seen so far; they represent 64% of the population, but 69% of the vote in 2014. We will see if that continues in this high-attention Gubernatorial race this November.
This week, I'm profiling each of the State Supreme Court's new Congressional Districts, looking at their voting behavior and their demographics. Today, the new District 03.
The district is the most diverse of Philadelphia's, and perhaps of the state. It combines the affluent neighborhoods of Center City and Fairmount with West Philly, and then reaches up to Germantown, Chestnut Hill and Manayunk.
The demographic hodge podge of neighborhoods has one thing in common: all are Democratic strongholds. District 3 becomes the most Democratic in the state, and would have been won by Clinton in 2016 by 92-8 (yesterday's District 02 would have been second, at 75-25). It isn't just liberal, but also a Party district: Hillary beat Bernie by 64-36, also the largest margin in the state.
The only neighborhoods in which Clinton didn't beat Trump by more than 50 points were Manayunk and the northern parts of South Philly, middle-income, predominantly White neighborhoods that share traits with Trump's base nationwide.
The votes are not evenly distributed with the population. Interestingly, the Black neighborhoods in Southwest Philly, Overbrook, and North Philly turned out strong in 2016, voting in numbers similar to their wealthier counterparts in Center City.
Those neighborhoods also fall off less between Presidential and Gubernatorial elections. The neighborhoods that vote for the President but don't for the Governor are the young, new-to-the-city neighborhoods: University City, Manayunk, and Penn's Landing/Northern Liberties. Chestnut Hill, Mount Airy, and Cedarbrook by far do the best in maintaining their voting through Gubernatorial years.
The diversity of the district played out in the 2016 primary. Manayunk, Queen Village, and University City all voted for Hillary. There's a clear racial divide, though not a complete one: all of the Black neighborhoods voted strongly for Hillary, while the White neighborhoods appear to be split, with gentrifying White neighborhoods swinging towards Bernie (though Hillary still eked those out) and wealthier, longer-White neighborhoods voting decisively for Clinton.
Black residents are the majority of the residents of the district, and they are a similar majority of the voters. They vote the most Democratic (though everyone in the neighborhood votes D at over 87%), and supported Hillary strongly over Bernie. White, non-Hispanic residents are disproportionately more likely to vote, and thus make up more of the electorate than they do of the population, though that's mostly true in Presidential elections, when the younger neighborhoods turn out.
This week, I'm profiling each of the State Supreme Court's new Congressional Districts, looking at their voting behavior and their demographics. Today, the new District 02.
The new District 02 covers Northeast and North Philly; basically the entire city above Race/Spring Garden and East of Broad. This is a diverse swath of the City, combining some of the poorest sections of North Philly with gentrifying Fishtown, with the Trumpy "Middle Neighborhoods" in the Northeast.
Overall, the district is decisively Democratic. It voted 75-25 for Clinton in 2016, and 80-20 for Wolf in 2014. It also voted for Clinton over Sanders in the 2016 primary, 62-38, by more than the state as a whole.
That decisive Democratic 2016 victory was a function of a Democratic sweep of North Philly, combined with basically an even split between Clinton and Trump in the Northeast.
The Democratic map almost perfectly lines up with the racial demographics. Those Trumpy neighborhoods were also the White neighborhoods (with the sole exception of Northern Liberties at the bottom), while the Black and Hispanic neighborhoods voted decisively for Clinton.
The areal maps can mislead about proportionality; North Philly is *much* denser than the Northeast. Those votes carried the day. Notice that even though the Hispanic section around Erie Ave has the population density, its low voting rates mean that its votes per mile are lower than those of the neighboring Black neighborhoods.
Of course, 2018 is a Gubernatorial election, not a presidential election. Turnout is much lower in these elections, and different people vote. This district sees its votes fall by half, in line with the state overall.
And neighborhoods fall disproportionately too.
Among the Democrats, which neighborhoods are the Berniest? Which are the Hillary-est? While Hillary swept the District, she decisively won in the Black and Hispanic neighborhoods. This matching between Sanders strongholds and Trump strongholds plays out all over the city, and across the country.
Finally, we can cross-tabulate votes by race. We don't have voter-level results by race, so I approximate it by aggregating votes to Census Block Groups, and allocating votes within them. This isn't perfect (see Ecological Fallacy), but block groups are small enough and Philadelphia segregated enough to make this a very good proxy for how different racial groups voted. (Note: these percentages are the two-candidate vote. This mostly affects the Primary results: if Clinton won 45% to Sanders' 40%, she would have won 45 / (45 + 40) = 53% of the two-candidate vote.)
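For the curious, here is a minimal sketch of that arithmetic in Python. The two-candidate share is exactly the calculation in the note above. The allocation step is only an assumption about how votes might be split within a block group (proportionally to each group's share of the population); the block groups, counts, and function names are hypothetical and are not the actual data or code behind these tables.

```python
from collections import defaultdict


def two_candidate_share(votes_a, votes_b):
    """Share of the two-candidate vote won by candidate A."""
    return votes_a / (votes_a + votes_b)


def allocate_votes_by_race(block_groups):
    """Split each block group's votes across racial groups.

    Assumption (not confirmed by the post): within a block group, every
    racial group votes the same way, so each candidate's votes are split
    in proportion to the group's share of the block group's population.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for bg in block_groups:
        for race, pop_share in bg["pop_share"].items():
            for candidate, votes in bg["votes"].items():
                totals[race][candidate] += votes * pop_share
    return totals


# Hypothetical block groups, for illustration only.
block_groups = [
    {"votes": {"Clinton": 90, "Sanders": 10},
     "pop_share": {"Black": 0.9, "White": 0.1}},
    {"votes": {"Clinton": 40, "Sanders": 60},
     "pop_share": {"Black": 0.1, "White": 0.9}},
]

totals = allocate_votes_by_race(block_groups)
shares = {
    race: two_candidate_share(v["Clinton"], v["Sanders"])
    for race, v in totals.items()
}

# The example from the note: 45% to 40% is about 53% of the two-candidate vote.
print(round(two_candidate_share(45, 40), 2))
```

Because segregated block groups have population shares close to 0 or 1, the proportional split changes little there, which is why the proxy works better in Philadelphia than it would in a more integrated city.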
White Non-Hispanic residents in the district are slightly over-represented among voters (43% of votes versus 40% of the population), and voted the least Democratic (though still being far from Republican).