The CHCCS socioeconomic analysis characterizes the demographic profiles of elementary school attendance zones using U.S. Census data. This page walks through the methodology step by step.
We’ll cover:
The map on the right shows the CHCCS district boundary and all 11 elementary schools.
Each CHCCS elementary school has an attendance zone — the geographic area from which students are assigned to that school by default. These zones tile the district with no gaps or overlaps.
The colored polygons show the 10 attendance zones from the CHCCS administrative shapefile. (Frank Porter Graham Bilingüe is a district-wide magnet school and may not have a distinct zone polygon.)
Demographic data comes from the American Community Survey (ACS) 5-Year Estimates (2020–2024). The Census Bureau publishes this data at the block group level — neighborhood-sized areas of roughly 600–3,000 people.
The map now shows block group boundaries (blue outlines) around the Northside Elementary area. Notice how they don’t line up with school zone boundaries.
Here’s the core problem: Census neighborhoods and school zones don’t share boundaries. The Census Bureau draws block groups based on population size and county lines. The school board draws attendance zones based on enrollment and geography. Neither was designed to match the other.
The red outline shows Northside’s zone boundary cutting across multiple block groups. A single block group may span multiple zones, and a single zone may contain parts of many block groups.
So how do we figure out the demographics of a school zone when the Census data is organized by completely different boundaries? That’s the challenge the next steps address.
This is formally known as the areal interpolation problem, first defined by Goodchild & Lam (1980): how do you transfer data from one set of geographic units to another?
The simplest approach: where a block group is split by a zone boundary, divide the population based on land area. If 40% of a block group’s area falls inside a school zone, that zone gets 40% of the people.
Think of it like spreading peanut butter on toast: the population is smeared uniformly across the entire block group, and you get whatever share of peanut butter matches your share of the bread.
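$$w_{st} = \frac{A_{st}}{A_s}$$

where A_st is the overlap area between source zone s and target zone t, A_s is the total area of source zone s, and w_st is the resulting share of source zone s that is assigned to target zone t.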
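As a minimal sketch of this area weight, assume for simplicity that the block group and the zone are axis-aligned rectangles (real units are irregular polygons and would need a geometry library). The `Rect` type, the helper names, and all numbers are illustrative and not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle standing in for a real polygon (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self) -> float:
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (0 if they do not overlap)."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(0.0, width) * max(0.0, height)


def area_weight(source: Rect, target: Rect) -> float:
    """Share of the source unit's area that falls inside the target unit."""
    return overlap_area(source, target) / source.area


# Hypothetical block group (10 x 10) with 40% of its area inside the zone.
block_group = Rect(0, 0, 10, 10)
zone = Rect(6, 0, 30, 10)
block_group_population = 1500

weight = area_weight(block_group, zone)
allocated = weight * block_group_population
print(f"weight={weight:.2f}, allocated population={allocated:.0f}")
```

With these made-up inputs the zone receives 40% of the block group's 1,500 residents, matching the 40% example in the text.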
The colored fragments show each zone–block group intersection piece, shaded by its area weight (darker = higher fraction of the block group’s area).
Area weighting assumes people are spread evenly across each block group, which is rarely true. To fix this “peanut butter” problem, we bring in additional information: property boundaries from the Orange County Tax Assessor that show exactly where homes are.
The green polygons show improved residential parcels — properties with homes where people actually live. Notice how they cluster in neighborhoods, leaving commercial areas, parks, and institutional land empty.
We filter to only improved residential properties, excluding vacant lots, commercial buildings, and other non-residential land.
Parcels are selected where is_residential = True and imp_vac contains "Improved". Residential land-use codes include 100, 110, 120, 630, and EXH prefixes.
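The selection rule could be expressed roughly as follows. Only the field names is_residential and imp_vac come from the text; the `pin` identifiers, the sample values, and the helper name are hypothetical, and the land-use code check is left out because the text does not say how the codes feed into is_residential.

```python
# Hypothetical parcel records; only `is_residential` and `imp_vac` are field
# names mentioned in the text, everything else is made up for illustration.
parcels = [
    {"pin": "A1", "is_residential": True, "imp_vac": "Improved"},
    {"pin": "A2", "is_residential": True, "imp_vac": "Vacant"},
    {"pin": "A3", "is_residential": False, "imp_vac": "Improved"},
    {"pin": "A4", "is_residential": True, "imp_vac": "Improved - Exempt"},
    {"pin": "A5", "is_residential": True, "imp_vac": None},
]


def is_improved_residential(parcel: dict) -> bool:
    """Keep parcels flagged residential whose imp_vac text contains 'Improved'."""
    status = parcel.get("imp_vac") or ""  # treat a missing value as empty text
    return bool(parcel.get("is_residential")) and "Improved" in status


improved_residential = [p for p in parcels if is_improved_residential(p)]
print([p["pin"] for p in improved_residential])
```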
Instead of treating all land equally, we now only count residential land when splitting population between zones. This technique is called dasymetric mapping — it uses extra information (here, where homes are) to place people more accurately.
The fragments are now colored by their residential-land weight. Compare with the previous step: areas with more homes get higher weights, while areas covering commercial or institutional land get lower weights.
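$$w_{st} = \frac{R_{st}}{R_s}$$

where R_st is the residential parcel area within the intersection of source zone s and target zone t, R_s is the total residential parcel area in source zone s, and w_st is the resulting share of source zone s that is assigned to target zone t.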
An R-tree spatial index on parcel polygons enables efficient intersection queries. For each geometry, candidate parcels are identified via bounding-box lookup, then precisely clipped, and their areas summed.
When a block group has no residential parcels (R_s = 0), the algorithm falls back to plain area weighting. In the CHCCS district, 179 of 184 fragments received dasymetric weights; 5 used the area-weighted fallback.
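The weighting and its fallback might look like the sketch below. It assumes the clipped areas have already been computed and that the listed fragments cover each block group completely, so summing fragments gives A_s and R_s; the record layout, names, and numbers are illustrative rather than the project's own.

```python
from collections import defaultdict

# Hypothetical fragments: one row per (block group, zone) intersection piece.
# `area` is the fragment's land area and `res_area` the residential parcel
# area clipped to it (square metres; all values are made up).
fragments = [
    {"bg": "BG1", "zone": "North", "area": 400.0, "res_area": 90.0},
    {"bg": "BG1", "zone": "South", "area": 600.0, "res_area": 10.0},
    {"bg": "BG2", "zone": "North", "area": 250.0, "res_area": 0.0},
    {"bg": "BG2", "zone": "South", "area": 750.0, "res_area": 0.0},
]


def dasymetric_weights(rows):
    """Return {(bg, zone): weight} using R_st / R_s, or A_st / A_s if R_s == 0."""
    total_area = defaultdict(float)
    total_res = defaultdict(float)
    for row in rows:
        total_area[row["bg"]] += row["area"]
        total_res[row["bg"]] += row["res_area"]

    weights = {}
    for row in rows:
        bg = row["bg"]
        if total_res[bg] > 0:
            weights[(bg, row["zone"])] = row["res_area"] / total_res[bg]
        else:
            # No residential parcels in this block group: area-weighted fallback.
            weights[(bg, row["zone"])] = row["area"] / total_area[bg]
    return weights


weights = dasymetric_weights(fragments)
for key, value in weights.items():
    print(key, round(value, 2))
```

In this toy data BG1 sends 90% of its people north even though only 40% of its land lies there, while BG2, which has no residential parcels, falls back to its 25% / 75% land split.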
Wright, J. K. (1936). A method of mapping densities of population. Geographical Review, 26(1), 103–110.
Goodchild, M. F. & Lam, N. S. (1980). Areal interpolation. Geo-Processing, 1, 297–312.
Mennis, J. (2003). Generating surface models of population using dasymetric mapping. The Professional Geographer, 55(1), 31–42.
The ACS provides raw counts (population, households below poverty, renter-occupied units, etc.) at the block group level. We compute 10 derived metrics from these counts:
The map shows block groups colored by the percentage of residents below 185% of the federal poverty level (FPL), a proxy for free and reduced-price lunch (FRL) eligibility. Darker red = higher poverty concentration.
ACS tables used: C17002 (poverty ratio), B03002 (race/ethnicity), B25003 (tenure), B25044 (vehicles), B11003 (family type), B01001 (age/sex), B19013 (median income)
A critical detail: when aggregating block groups into zones, we sum the counts first, then recompute percentages — we never average percentages directly.
Why? Consider two block groups overlapping a zone:
| Block Group | Population | Below 185% FPL | % Poverty |
|---|---|---|---|
| BG A (weight 0.8) | 2,000 | 600 | 30% |
| BG B (weight 0.3) | 500 | 25 | 5% |
Wrong (averaging the percentages): (30% × 0.8 + 5% × 0.3) / (0.8 + 0.3) = 23.2%
Correct (adding up the people first): (600 × 0.8 + 25 × 0.3) / (2000 × 0.8 + 500 × 0.3) = 487.5 / 1750 = 27.9%
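Averaging percentages ignores how many people each block group actually contributes. Here BG B supplies only 150 of the zone's 1,750 weighted residents (about 9%), yet it carries 0.3 / 1.1 ≈ 27% of the weight in the averaged figure, which pulls the estimate down from 27.9% to 23.2%. This is a well-known statistical pitfall that distorts the true picture.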
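The two calculations can be reproduced in a few lines. The numbers come from the table above; the variable names are illustrative.

```python
# (population, people below 185% FPL, weight) from the table above.
block_groups = {
    "BG A": (2000, 600, 0.8),
    "BG B": (500, 25, 0.3),
}

# Wrong: weighted average of the two percentages.
weight_sum = sum(w for _, _, w in block_groups.values())
avg_of_pcts = (
    sum((poor / pop) * w for pop, poor, w in block_groups.values()) / weight_sum
)

# Correct: allocate counts to the zone, sum them, then divide once.
zone_poor = sum(poor * w for _, poor, w in block_groups.values())
zone_pop = sum(pop * w for pop, _, w in block_groups.values())
pct_from_counts = zone_poor / zone_pop

print(f"average of percentages: {avg_of_pcts:.1%}")
print(f"percentage from summed counts: {pct_from_counts:.1%}")
```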
Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15(3), 351–357.
For the dot-density map, we need finer detail than block groups. The 2020 Decennial Census provides race/ethnicity counts at the block level — much smaller areas, often just a few city blocks.
The map shows Census block boundaries (dashed outlines) within the Northside area. Each block has population counts for 6 race categories.
2020 Census P.L. 94-171 Redistricting Data, tables P1 (race) and P2 (Hispanic origin)
The Decennial Census gives us block-level race data, but the ACS’s socioeconomic data (income, poverty, etc.) is only available at the larger block-group level. To create detailed color-coded maps, we split the block-group data down to individual blocks.
Each small block gets a share of its neighborhood’s data, based on how much residential land it contains. Counts like population are divided proportionally, while median income is copied directly (you can’t split a median).
Each block inherits a fraction of its parent block group’s counts, weighted by residential parcel area. Percentages are recomputed from the downscaled counts.
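A rough sketch of the downscaling step, assuming each block's residential parcel area is already known. The field names, the numbers, and the equal-split behaviour when a block group has no residential land at all are assumptions made for illustration.

```python
# Hypothetical block group record; counts are split, the median is copied.
block_group = {"population": 1200, "below_185_fpl": 300, "median_income": 64000}

# Residential parcel area (m^2) inside each child block; values are made up.
block_res_area = {"block_1": 5000.0, "block_2": 15000.0, "block_3": 0.0}


def downscale(bg: dict, res_area: dict) -> dict:
    """Split a block group's counts across its blocks by residential area."""
    total = sum(res_area.values())
    blocks = {}
    for block_id, area in res_area.items():
        # Assumption: split equally if the block group has no residential land.
        share = area / total if total > 0 else 1 / len(res_area)
        population = bg["population"] * share
        poor = bg["below_185_fpl"] * share
        blocks[block_id] = {
            "population": population,
            "below_185_fpl": poor,
            # Percentages are recomputed from the downscaled counts.
            "pct_below_185": poor / population if population > 0 else None,
            # A median cannot be split, so every block inherits it unchanged.
            "median_income": bg["median_income"],
        }
    return blocks


blocks = downscale(block_group, block_res_area)
for block_id, values in blocks.items():
    print(block_id, values)
```

Because every count in a block is scaled by the same share, each populated block in this sketch ends up with the same percentage as its parent block group; only the counts differ between blocks.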
The map shows blocks colored by downscaled % below 185% poverty.
The dot-density layer represents every person in the 2020 Census as a single colored dot, placed randomly within their Census block. Six race/ethnicity categories are shown:
At a 1:1 ratio (one dot per person), the full district map contains ~95,000 dots. This page shows only the Northside area to keep the file manageable.
Instead of scattering dots randomly across entire Census blocks — including roads, parking lots, and commercial buildings — we place dots only on residential land. Each dot is constrained to the overlap between its Census block and nearby residential parcels.
Compare the dots with the green parcel outlines: dots cluster on residential land, not on roads or institutional areas.
For each block, a placement geometry is computed as block ∩ union(nearby_residential_parcels). If the intersection area is < 10 m², the full block is used as fallback. Random points are generated via shapely.random_points() with a fixed seed (rng = np.random.default_rng(42)) for reproducibility.
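The placement logic can be illustrated without any geometry library. This sketch deliberately swaps in plain rectangles and Python's `random.Random(42)` for the shapely and NumPy calls named above, so it is an analogy to the described method rather than the project's code. It also assumes the parcels do not overlap each other; all coordinates are made up.

```python
import random

MIN_PLACEMENT_AREA = 10.0  # square metres, the fallback threshold from the text


def clip(rect, bounds):
    """Intersect two (xmin, ymin, xmax, ymax) rectangles; None if empty."""
    xmin, ymin = max(rect[0], bounds[0]), max(rect[1], bounds[1])
    xmax, ymax = min(rect[2], bounds[2]), min(rect[3], bounds[3])
    return (xmin, ymin, xmax, ymax) if xmin < xmax and ymin < ymax else None


def area(rect):
    """Area of an (xmin, ymin, xmax, ymax) rectangle."""
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def place_dots(block, parcels, n_dots, rng):
    """Scatter n_dots uniformly over block ∩ parcels, or the block as fallback."""
    pieces = [p for p in (clip(parcel, block) for parcel in parcels) if p]
    if sum(area(p) for p in pieces) < MIN_PLACEMENT_AREA:
        pieces = [block]  # too little residential land: use the whole block
    # Pick a piece with probability proportional to its area, then a point in it.
    chosen = rng.choices(pieces, weights=[area(p) for p in pieces], k=n_dots)
    return [(rng.uniform(p[0], p[2]), rng.uniform(p[1], p[3])) for p in chosen]


rng = random.Random(42)  # fixed seed so the same dots appear on every run
block = (0.0, 0.0, 100.0, 100.0)
parcels = [(10.0, 10.0, 30.0, 40.0), (60.0, 50.0, 120.0, 80.0)]  # non-overlapping
dots = place_dots(block, parcels, n_dots=25, rng=rng)
print(dots[:3])
```

Note that the second parcel extends past the block edge, so only the part inside the block can receive dots.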
The final step: add up all the weighted pieces to get a demographic profile for each school’s attendance zone. Counts are summed first, then percentages are calculated from those totals.
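One way to picture this aggregation is as a join between block-group counts and fragment weights, followed by a per-zone sum. The zone names, counts, and weights here are invented for illustration.

```python
from collections import defaultdict

# Hypothetical inputs: block-group counts and fragment weights per zone.
bg_counts = {
    "BG1": {"population": 1000, "below_185_fpl": 200},
    "BG2": {"population": 2000, "below_185_fpl": 100},
}
fragment_weights = [  # (block group, zone, weight)
    ("BG1", "Zone A", 0.75),
    ("BG1", "Zone B", 0.25),
    ("BG2", "Zone B", 1.0),
]

# Step 1: sum the weighted counts for every zone.
zone_totals = defaultdict(lambda: defaultdict(float))
for bg, zone, weight in fragment_weights:
    for metric, count in bg_counts[bg].items():
        zone_totals[zone][metric] += count * weight

# Step 2: compute percentages once, from the zone totals.
zone_profiles = {
    zone: {**totals, "pct_below_185": totals["below_185_fpl"] / totals["population"]}
    for zone, totals in zone_totals.items()
}

for zone, profile in zone_profiles.items():
    print(zone, profile)
```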
The map shows attendance zones colored by % below 185% poverty (the FRL eligibility proxy). Click any zone to see its full demographic profile.
The map shows MLS home sale records (2023–2025) across the district, color-coded by price quartile. Each dot is a closed sale.
Home sale data adds a market-level perspective to Census demographics. The production map lets users switch between zone definitions and see how sale counts, median prices, and price-per-square-foot shift when boundaries are redrawn by proximity rather than official assignment.
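A small sketch of the per-zone sale summaries, assuming each sale has already been assigned to a zone. The records are made up, and taking the median of per-sale price per square foot is an assumption; the text does not say how that metric is summarized.

```python
from collections import defaultdict
from statistics import median

# Hypothetical closed sales already assigned to a zone.
sales = [
    {"zone": "Zone A", "price": 450_000, "sqft": 1800},
    {"zone": "Zone A", "price": 600_000, "sqft": 2400},
    {"zone": "Zone A", "price": 390_000, "sqft": 1300},
    {"zone": "Zone B", "price": 820_000, "sqft": 3200},
    {"zone": "Zone B", "price": 700_000, "sqft": 2500},
]

# Group the sales by zone label.
by_zone = defaultdict(list)
for sale in sales:
    by_zone[sale["zone"]].append(sale)

zone_metrics = {
    zone: {
        "sale_count": len(rows),
        "median_price": median(r["price"] for r in rows),
        "median_price_per_sqft": median(r["price"] / r["sqft"] for r in rows),
    }
    for zone, rows in by_zone.items()
}

for zone, metrics in zone_metrics.items():
    print(zone, metrics)
```

Switching to a different zone definition only changes the zone label attached to each sale; the summary step stays the same.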
The map shows affordable housing locations across the district, color-coded by AMI (Area Median Income) band. Each dot is a subsidized unit.
Affordable housing is unevenly distributed across the district. The production map aggregates unit counts per zone, allowing comparison of how many subsidized units fall within each school’s catchment area under different zone definitions.
The map can overlay MLS home sale records from the Triangle MLS (2023–2025). Each dot represents a closed sale, color-coded by price quartile:
Addresses are geocoded using the Census Bureau batch geocoder (primary) with Nominatim fallback, then clipped to the CHCCS district boundary. Three metrics are computed per zone:
The map can overlay planned residential developments from the Town of Chapel Hill’s Active Development page (hand-transcribed March 12, 2026). Each circle represents one development, colored by expected unit count:
Developments are geocoded using the Census Bureau batch geocoder (primary) with Nominatim fallback, then clipped to the CHCCS district boundary. Two metrics are computed per zone:
A supplementary dataset from the SAPFOTAC 2025 Annual Report (Schools Adequate Public Facilities Ordinance Technical Advisory Committee, certified June 3, 2025) lists 21 future residential projects with projected student yields: estimates of the elementary, middle, and high school students each development will generate.
The same blue–to–red color scheme applies, scaled by remaining housing units. Click a marker for per-project detail including student yield breakdowns.
The bar charts show % below 185% poverty for each school under two different zone definitions — computed from the same underlying Census data using the same dasymetric methodology.
School Zones use the official CHCCS attendance boundaries. Nearest Drive assigns each location to whichever school is closest by driving distance (Dijkstra shortest-path on the OpenStreetMap road network).
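The Nearest Drive assignment can be sketched as a multi-source Dijkstra search that starts from every school at once, so each road node is labeled with whichever school reaches it first. The toy graph, node names, and school names are hypothetical, and the graph is treated as undirected, which ignores one-way streets in a real road network.

```python
import heapq

# Hypothetical road graph: node -> [(neighbour, driving distance in metres)].
roads = {
    "a": [("b", 400), ("c", 900)],
    "b": [("a", 400), ("c", 300), ("d", 700)],
    "c": [("a", 900), ("b", 300), ("e", 500)],
    "d": [("b", 700), ("e", 200)],
    "e": [("c", 500), ("d", 200)],
}
school_nodes = {"a": "School X", "e": "School Y"}  # road node nearest each school


def nearest_school(graph, schools):
    """Multi-source Dijkstra: label every node with its closest school."""
    best = {}  # node -> (distance, school)
    queue = [(0, node, name) for node, name in schools.items()]
    heapq.heapify(queue)
    while queue:
        dist, node, name = heapq.heappop(queue)
        if node in best:
            continue  # already settled via a path that is at least as short
        best[node] = (dist, name)
        for neighbour, length in graph[node]:
            if neighbour not in best:
                heapq.heappush(queue, (dist + length, neighbour, name))
    return best


assignment = nearest_school(roads, school_nodes)
for node in sorted(assignment):
    print(node, assignment[node])
```

Running one search from all schools together avoids computing a separate shortest-path tree per school and comparing the results afterwards.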
The differences illustrate how boundary definitions affect school-level demographic composition. A school whose official zone extends into a lower-income area may show a very different poverty rate when zone boundaries are redrawn by proximity.
The full production map lets users switch between 5 zone types interactively, with bar charts that update in real time.
The socioeconomic analysis has 26 documented limitations. Key ones:
The complete list of 26 limitations with references and validation results is available in the project documentation.