
Methods

How do we calculate place inequality?

  1. Data

    Our anonymous data contains billions of visits to locations in the Boston Metro Area from October 2016 to March 2017. We analyze mobility patterns of about 150,000 anonymous users, provided through Cuebiq’s Data for Good initiative, in the whole Boston Metro area (around 3% of the population). MIT Media Lab coupled Cuebiq's anonymous location data with census information to determine user income groups and identify user visitation patterns to places. We use the census-based core-based statistical areas to define metro areas. This definition encompasses the cities, suburbs, and commuting zones that are economically tied to a central city of interest. In the Boston case, it contains large areas in New Hampshire and down the Massachusetts coast. Finally, we have used the public Foursquare API to get the list of the 30,000 places that users visit.

  2. Estimating home and income

    Using the same anonymous data, we determine users’ home locations at the level of Census Block Groups, statistical subareas of census tracts that each contain between 600 and 3,000 people. To determine a user's home census block group, we find the block group in which the user most often spends time between 8:00 pm and 4:00 am. We then assign each user an income using the median household income of their "home" census block group from the 5-year American Community Survey for 2012-2016. Users were then divided into four income quartiles by annual income: low ($10,000 - $67,000), medium ($67,000 - $90,000), upper medium ($90,000 - $114,000), and high (over $114,000). Each income group contains exactly 1/4 of the users in our dataset.
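The home and income assignment described above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: the function names, the input shape (a list of timestamped block group observations), counting observations rather than weighting by time, and the handling of values that fall exactly on a bracket boundary are all assumptions.

```python
from collections import Counter
from datetime import datetime


def is_night(ts: datetime) -> bool:
    """True if the timestamp falls in the 8:00 pm to 4:00 am window."""
    return ts.hour >= 20 or ts.hour < 4


def home_block_group(observations):
    """Return the most common night-time block group, or None.

    observations: iterable of (datetime, block_group_id) pairs (assumed shape).
    """
    counts = Counter(bg for ts, bg in observations if is_night(ts))
    if not counts:
        return None  # no night-time data, so no home can be assigned
    return counts.most_common(1)[0][0]


def income_group(median_household_income: float) -> str:
    """Map a block group's median income to one of the four brackets.

    Boundary handling (e.g. exactly $67,000) is an assumption.
    """
    if median_household_income < 67_000:
        return "low"
    if median_household_income < 90_000:
        return "medium"
    if median_household_income < 114_000:
        return "upper medium"
    return "high"
```

A user with no observations in the night-time window gets no home block group in this sketch, and therefore no income group.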

  3. Detecting where people visit

    Using location data as described above, we determine when users stop for more than 5 minutes in a particular place. If a user stops for more than 5 minutes within 20 meters from a place, we treat them as having visited that place.
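As a rough illustration of this rule, the sketch below checks a single stop against a single place. It assumes stops have already been extracted from the raw location data as (latitude, longitude, duration in minutes); the function names and the use of the haversine great-circle distance are assumptions, not a description of the project's pipeline.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_visit(stop, place, max_dist_m=20.0, min_minutes=5.0):
    """True if the stop lasts more than 5 minutes within 20 meters of the place.

    stop: (lat, lon, duration_minutes); place: (lat, lon). Both shapes are assumed.
    """
    lat, lon, minutes = stop
    close_enough = haversine_m(lat, lon, place[0], place[1]) <= max_dist_m
    return minutes > min_minutes and close_enough
```

Note that this check ignores GPS noise, so a stop recorded a few meters off can be attributed to the wrong place or missed entirely.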


  4. Measuring inequality in a place

    To measure inequality in a place, we first compute the total time each user spends in a venue, and then sort all users by their income group. Then, we compute the total time each income group spends in each place in our dataset. The end result is a chart for each place that shows how likely you are to see people from each income quartile if you visit.
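A minimal sketch of this aggregation, assuming each visit is recorded as (user id, place id, minutes spent) and each user already has an income group. The names and data shapes are hypothetical.

```python
from collections import defaultdict

GROUPS = ("low", "medium", "upper medium", "high")


def income_shares(visits, user_group):
    """Share of total visit time contributed by each income group, per place.

    visits: iterable of (user_id, place_id, minutes) (assumed shape).
    user_group: dict mapping user_id to one of GROUPS.
    """
    totals = defaultdict(lambda: dict.fromkeys(GROUPS, 0.0))
    for user, place, minutes in visits:
        totals[place][user_group[user]] += minutes

    shares = {}
    for place, by_group in totals.items():
        total = sum(by_group.values())
        if total > 0:  # skip places with no recorded time
            shares[place] = {g: t / total for g, t in by_group.items()}
    return shares
```

The resulting four shares for a place are what the per-place chart displays.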

    To measure the inequality of a place, we use a city-wide normalized measure. Because income groups correspond to exactly 25% of the users in the area, a totally equal place would be visited by each income group 25% of the time. This corresponds to a flat chart, like the one on the left. We consider any deviation from that flat profile to contribute to inequality. For example, a totally segregated place would show a chart more like the one on the right, with 100% of the users coming from a single income group. Mathematically, we calculate the absolute deviation of a place's income distribution from that flat ideal and normalize it to range from 0% (flat distribution) to 100% (only one group is present).
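One way to write this measure down, under stated assumptions: sum the absolute differences between each group's share and 25%, then divide by the largest value that sum can take. With four groups, a place visited by a single group gives |1 - 0.25| + 3 × |0 - 0.25| = 1.5, so dividing by 1.5 maps the result onto the 0% to 100% range. The normalization constant here is inferred from the two endpoints described in the text, and the function name is hypothetical.

```python
def place_inequality(shares):
    """Normalized absolute deviation of income shares from a flat profile.

    shares: sequence of fractions, one per income group, summing to 1.
    Returns 0.0 for a flat distribution and 1.0 when one group is present.
    """
    n = len(shares)
    flat = 1.0 / n
    deviation = sum(abs(s - flat) for s in shares)
    max_deviation = 2.0 * (1.0 - flat)  # reached when a single group has share 1
    return deviation / max_deviation
```

For example, a place visited only by two groups in equal parts (shares 0.5, 0.5, 0, 0) has a deviation of 1.0 and so scores about 67% under this sketch.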

  5. Limitations

    There are several limitations to our approach that could mean our results are incomplete. First, we select users based on having consistent home locations during the 6 months of our dataset. This excludes smartphone users who are homeless, in transition, or who work non-normative shifts (between 8pm and 4am). Second, the venues we consider are limited to those available via the Foursquare API. Third, our approach to determining whether a user visits a place does not account for GPS noise, and so some of our records of users visiting places could be wrong. See the FAQ for more about data representativeness.

    Additionally, it is important to note that the Atlas records income inequality, not class inequality. The lowest income quartile we use encompasses a huge range of both income and socioeconomic difference, covering neighborhoods with median household incomes between $10,000 and $67,000 per year. The qualitative class differences (lifestyle, opportunity, etc.) between a household with $20,000 of annual income and one with $60,000 are extremely large in the Boston area. Income is also known to be an imperfect proxy for class; many highly wealthy households have little annual income.

Stories // Across the street

Places with economically diverse visitors can be only a few feet away from places where only the wealthiest spend their time. The way we experience economic inequality in cities is impacted by where we go, not just by where we live.

When you want to grab a cup of coffee, where do you go? Does it depend on whether you’re at home or at work? How many other coffee shops do you pass on the way to your favorite spot?

We all have our own preferences and needs -- like choosing a latte or a simple cup of black coffee. When we head out into our cities and towns, we use our preferences to decide where to spend our time. Through these choices, we build our habits and routines.

The fact that our routines decide where we spend our time also means that they decide who we spend our time with. Although we may get our daily coffee on the same city block, the specific places we're in -- and the people around us -- can be radically different.

Map of two coffeehouses and different incomes

In Boston, like in other cities, coffee shops can be as close as a few feet apart. The two coffee shops in the figure above are on the same city block. Although they are within a 2-minute walk of one another, the incomes of the people who visit them are very different.

Coffeehouse 2 is visited almost exclusively by people we’ve identified as being in Boston’s lowest income bracket, while Coffeehouse 1 has visitors more representative of all incomes. The choice of which coffee shop to visit can contribute significantly to income segregation.

It’s important to note that these choices that people make are usually constrained by things like affordability, location, and social groups. But a place where people of all different incomes come together can be on the same block as a place where only the poorest -- or the wealthiest -- spend their time. This means that socio-economic inequality in cities is encoded in part by these choices, not just where people live.

Research

Papers

  • The segregated places of United States cities

    E. Moro, D. Calacci, X. Dong, and A. Pentland, forthcoming (2019).

Talks

  • Overcoming urban isolation

    Esteban Moro at TEDxCambridge

  • Cities at high resolution (in Spanish)

    Esteban Moro at BBVA Foundation, Demography today

FAQ

Places

  • How are the places selected?

    The places are extracted using the Foursquare API. We have only considered places that are verified (claimed by the owners) or that have at least 5 check-ins within six months. To preserve the anonymity of users, we only consider places where we detected at least 20 different users within the six-month period.

  • There is a place I know that isn't included, why?

    This could happen for many reasons: it might not be included in the public Foursquare API, or we might have discarded it for privacy or statistical reasons. If you know the actual location of the place, we can possibly include it! Please send us an email specifying the name of the place, its location, and any other details for us to add it to the atlas.

  • Is your data representative?

    In general, yes. To test whether our sample of users is representative, we’ve run several analyses comparing the incomes of people in our dataset to census reports; the average income of our user sample is only 8.6% higher than in the census data. To test whether our method of detecting when a person spends time in a place is accurate, we use our data to estimate attendance at professional sports games (NFL, NBA, and NHL). Our estimates are generally within a few thousand of the official attendance counts, which suggests that the data we use here can describe visits to a specific place quite well.
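The place-selection rules from the first answer in this section can be read as a simple filter. The sketch below is only an illustration of those stated thresholds; the function and parameter names are hypothetical and do not correspond to fields of the Foursquare API.

```python
def keep_place(verified: bool, checkins_six_months: int, distinct_users_six_months: int) -> bool:
    """Apply the selection rules described in the FAQ answer on place selection.

    A place is kept if it is verified or has at least 5 check-ins in six
    months, and at least 20 different users were detected there.
    """
    listed = verified or checkins_six_months >= 5
    enough_users = distinct_users_six_months >= 20  # privacy threshold
    return listed and enough_users
```

Under this reading, a verified place with only 19 detected users is still dropped, because the privacy threshold applies regardless of how the place qualified.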

Inequality

  • What do you mean when you say a place is “65% unequal”?

    A place is unequal if it is not visited equally by the four income quartiles we've identified in the city. A high value of inequality in a place means that most of its visitors come from one or two of these income groups. You can find more information about how the inequality score is computed in the Methods section.

  • I visit place X a lot, and know that X is more/less unequal than you reported, why?

    We determine that a user has spent time in a place when our data shows that they’ve spent more than five minutes within 20 meters of it. We’ve tried our best to be careful in assigning user locations to places, but our method isn’t perfect. When places are very close to each other (less than 20 meters apart), or when places are on different floors of the same building, our data is noisier. Please let us know about the specific place in question by sending us an email, and we will investigate it.

Privacy

  • Where did you obtain the data about visits at locations? Is this legal?

    The anonymous location data come from a collaboration with Cuebiq through their Data for Good program. Cuebiq is a location intelligence company that collects data through its proprietary software-development-kit technology integrated in more than 200 mobile apps, reaching a diverse user base of millions of anonymous users who have opted in to share location data. Cuebiq is fully compliant with the GDPR in the EU; its privacy-compliant methodology is at the forefront of industry standards and has earned the company membership status with the Network Advertising Initiative (NAI), the leading industry association dedicated to responsible data collection and its use for digital advertising.

  • Are you singling out individuals?

    No. We are not interested in individual behavior and are not using individual behavior in our analysis. We anonymize our data before analysis by aggregating by neighborhoods, and we are only interested in aggregated trends about the economic diversity of locations visited. Our final data consists of aggregations (counts) at the level of places, not individuals. To ensure privacy and help prevent de-anonymization and re-identification, we only consider places in which we detected at least 20 different users within the six-month period.

About

The Atlas of Inequality is a project from the Human Dynamics group at the MIT Media Lab and the Department of Mathematics at Universidad Carlos III de Madrid.

It is part of a broader initiative to understand human behavior in our cities and how large-scale problems like transportation, housing, segregation or inequality depend in part on the emergent patterns of people’s individual opportunities and choices.

Comments, questions? Contact us by email

Principal Investigators:

  • Esteban Moro
    @estebanmoro
    MIT + UC3M

  • Alex “Sandy” Pentland
    MIT

Researchers:

  • Dan Calacci
    @dcalacci
    MIT

  • Xiaowen Dong
    MIT + Oxford

Acknowledgments:

With thanks to:

  • Cuebiq

    Cuebiq is a location intelligence and consumer insights company that maps and measures the offline consumer journey. Cuebiq's platform analyzes anonymous location patterns to allow researchers to glean actionable insights on human mobility. Through its Data for Good Program, Cuebiq provides access to location-based intelligence for academic research and humanitarian initiatives related to urban development, economic mobility, transportation, disaster response, and epidemiology, among other use cases.

  • CARTO

    From smartphones to connected cars, location data is changing the way we live and the way business happens. CARTO is the platform that turns location data into more efficient delivery routes, better behavioural marketing, strategic store placements, and much more. Business analysts, developers and data scientists use CARTO's software and data to understand where and why things happen, optimize business processes, and predict future outcomes.

Inequality index: from Very Equal to Very Unequal

  • Browse Stories we’ve made from data we’ve curated, or explore the Map for yourself. Learn more about the data, its privacy implications, and how we made the maps.

  • Search: You can search for a particular address here.

  • Inequality distribution: Distribution of place inequality in the current view. Select a range to show only places with a certain level of inequality.

  • Point style: Toggle between places colored by the segregation index or by place category (Shop & Service, Food, etc.).

  • Income Filter: View only places that are visited mostly by only one income group.

  • Place category: List with the number of places by category in the current view. Click on one or more categories to show only places in those categories.