How the Demographics Map Works

1

The CHCCS socioeconomic analysis characterizes the demographic profiles of elementary school attendance zones using U.S. Census data. This page walks through the methodology step by step.

We’ll cover:

How Census data is mapped to school zones
Why using residential land improves accuracy
How the dot-density map is generated
Key limitations you should know about

The map on the right shows the CHCCS district boundary and all 11 elementary schools.

Data: NCES EDGE School Locations 2023-24 (LEAID 3700720) • Census TIGER/Line Unified School Districts 2024

2

Schools & Attendance Zones

Each CHCCS elementary school has an attendance zone — the geographic area from which students are assigned to that school by default. These zones tile the district with no gaps or overlaps.

The colored polygons show the 10 attendance zones from the CHCCS administrative shapefile. (Frank Porter Graham Bilingüe is a district-wide magnet school and may not have a distinct zone polygon.)

Data: CHCCS elementary attendance zone shapefile, dissolved by ENAME field.

Limitation: Attendance zones define default assignment. Actual enrollment differs due to transfers, charter/private school attendance, and the FPG magnet program.

3

Census Block Groups

Demographic data comes from the American Community Survey (ACS) 5-Year Estimates (2020–2024). The Census Bureau publishes this data at the block group level — neighborhood-sized areas of roughly 600–3,000 people.

The map now shows block group boundaries (blue outlines) around the Northside Elementary area. Notice how they don’t line up with school zone boundaries.

Data: U.S. Census Bureau, ACS 5-Year 2020–2024 • TIGER/Line 2024 block group geometries (Orange County, NC)

4

The Mismatch: Zones vs. Block Groups

Here’s the core problem: Census neighborhoods and school zones don’t share boundaries. The Census Bureau draws block groups based on population size and county lines. The school board draws attendance zones based on enrollment and geography. Neither was designed to match the other.

The red outline shows Northside’s zone boundary cutting across multiple block groups. A single block group may span multiple zones, and a single zone may contain parts of many block groups.

So how do we figure out the demographics of a school zone when the Census data is organized by completely different boundaries? That’s the challenge the next steps address.

Technical background

This is formally known as the areal interpolation problem, first defined by Goodchild & Lam (1980): how do you transfer data from one set of geographic units to another?

5

Simple Area Weighting

The simplest approach: where a block group is split by a zone boundary, divide the population based on land area. If 40% of a block group’s area falls inside a school zone, that zone gets 40% of the people.

Think of it like spreading peanut butter on toast: the population is smeared uniformly across the entire block group, and you get whatever share of peanut butter matches your share of the bread.

See the formula

Ŷ_t = Σ_s (A_st / A_s) × Y_s

where A_st is the overlap area between source zone s and target zone t, and A_s is the total area of source zone s.
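The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the function name, the dictionary layout, and the toy numbers are all assumptions made for the example.

```python
def area_weighted_estimate(source_counts, source_areas, overlap_areas):
    """Estimate target-zone counts as the sum over s of (A_st / A_s) * Y_s.

    source_counts: {source_id: Y_s}
    source_areas:  {source_id: A_s}
    overlap_areas: {(source_id, target_id): A_st}
    """
    estimates = {}
    for (s, t), a_st in overlap_areas.items():
        share = a_st / source_areas[s]  # fraction of the block group's area inside zone t
        estimates[t] = estimates.get(t, 0.0) + share * source_counts[s]
    return estimates


# Toy data (hypothetical): one block group of 1,000 people, 40% of its area in zone "A".
counts = {"bg1": 1000}
areas = {"bg1": 2.0}  # any consistent area unit
overlaps = {("bg1", "A"): 0.8, ("bg1", "B"): 1.2}

result = area_weighted_estimate(counts, areas, overlaps)
print(result)
```

With these toy inputs, zone A receives 40% of the people and zone B the remaining 60%, matching the example in the text.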

The colored fragments show each zone–block group intersection piece, shaded by its area weight (darker = higher fraction of the block group’s area).

Problem: People don’t live in parking lots, parks, or commercial buildings. A block group that is half apartments and half shopping mall will have its population spread across both, even though 100% of residents live in the apartment half.

6

Residential Parcels

To fix the “peanut butter” problem, we bring in additional information: property boundaries from the Orange County Tax Assessor that show exactly where homes are.

The green polygons show improved residential parcels — properties with homes where people actually live. Notice how they cluster in neighborhoods, leaving commercial areas, parks, and institutional land empty.

We filter to only improved residential properties, excluding vacant lots, commercial buildings, and other non-residential land.

Land-use filter details

Parcels are selected where is_residential = True and imp_vac contains "Improved". Residential land-use codes include 100, 110, 120, 630, and EXH prefixes.
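A rough sketch of such a filter follows. The imp_vac field and the code list come from the description above; the land_use_code field name, the way the residential flag is derived from the codes, and the sample records are assumptions for illustration only.

```python
RESIDENTIAL_CODES = {"100", "110", "120", "630"}


def is_residential_code(land_use_code):
    """Assumed rule: a listed code, or any code starting with 'EXH'."""
    code = str(land_use_code).strip().upper()
    return code in RESIDENTIAL_CODES or code.startswith("EXH")


def select_improved_residential(parcels):
    """Keep parcels that are residential and whose imp_vac contains 'Improved'."""
    return [
        p for p in parcels
        if is_residential_code(p["land_use_code"]) and "Improved" in p["imp_vac"]
    ]


# Hypothetical records; 'land_use_code' is an illustrative field name.
parcels = [
    {"pin": "A", "land_use_code": 100, "imp_vac": "Improved"},
    {"pin": "B", "land_use_code": "100", "imp_vac": "Vacant"},      # vacant lot: dropped
    {"pin": "C", "land_use_code": "400", "imp_vac": "Improved"},    # non-residential: dropped
    {"pin": "D", "land_use_code": "EXH-R", "imp_vac": "Improved"},  # EXH prefix: kept
]

selected = select_improved_residential(parcels)
print([p["pin"] for p in selected])
```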

Data: Orange County, NC Tax Assessor parcel data • Residential land-use codes: 100, 110, 120, 630, EXH prefixes

7

Dasymetric Weighting

Instead of treating all land equally, we now only count residential land when splitting population between zones. This technique is called dasymetric mapping — it uses extra information (here, where homes are) to place people more accurately.

The fragments are now colored by their residential-land weight. Compare with the previous step: areas with more homes get higher weights, while areas covering commercial or institutional land get lower weights.

See the formula

Ŷ_t = Σ_s (R_st / R_s) × Y_s

where R_st is the residential parcel area within the intersection of source zone s and target zone t, and R_s is the total residential parcel area in source zone s.

Technical details

An R-tree spatial index on parcel polygons enables efficient intersection queries. For each geometry, candidate parcels are identified via bounding-box lookup, then precisely clipped, and their areas summed.

When a block group has no residential parcels (R_s = 0), the algorithm falls back to plain area weighting. In the CHCCS district, 179 of 184 fragments received dasymetric weights; 5 used the area-weighted fallback.
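The weighting rule and its fallback can be sketched as a single function. This is an illustrative reading of the description above, assuming the residential and total areas have already been measured; the function name, argument names, and toy areas are hypothetical.

```python
def fragment_weight(frag_res_area, bg_res_area, frag_area, bg_area):
    """Return (weight, method) for one zone/block-group fragment.

    Uses R_st / R_s when the block group has residential land,
    otherwise falls back to plain area weighting A_st / A_s.
    """
    if bg_res_area > 0:
        return frag_res_area / bg_res_area, "dasymetric"
    return frag_area / bg_area, "area_fallback"


# Toy block group: half apartments, half shopping mall, split by a zone boundary.
apartments_side = fragment_weight(30.0, 30.0, 50.0, 100.0)
mall_side = fragment_weight(0.0, 30.0, 50.0, 100.0)

# Toy block group with no residential parcels at all: fallback applies.
no_homes = fragment_weight(0.0, 0.0, 25.0, 100.0)

print(apartments_side, mall_side, no_homes)
```

In the toy case, plain area weighting would give each side of the boundary half of the population, while the residential weighting assigns all of it to the apartment side.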

References

Wright, J. K. (1936). A method of mapping densities of population. Geographical Review, 26(1), 103–110.

Goodchild, M. F. & Lam, N. S. (1980). Areal interpolation. Geo-Processing, 1, 297–312.

Mennis, J. (2003). Generating surface models of population using dasymetric mapping. The Professional Geographer, 55(1), 31–42.

8

From Counts to Metrics

The ACS provides raw counts (population, households below poverty, renter-occupied units, etc.) at the block group level. We compute 10 derived metrics from these counts:

% Below 185% poverty — a proxy for Free/Reduced-price Lunch (FRL) eligibility
% Minority (everyone other than non-Hispanic White)
% Black • % Hispanic
% Renter-occupied housing
% Zero-vehicle households
% Single-parent families
% Elementary-age (5–9) • % Young children (0–4)
Median household income

The map shows block groups colored by % below 185% poverty (the FRL eligibility proxy). Darker red = higher poverty concentration.

ACS table codes

C17002 (poverty ratio), B03002 (race/ethnicity), B25003 (tenure), B25044 (vehicles), B11003 (family type), B01001 (age/sex), B19013 (median income)
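As one example of turning counts into a metric, the sketch below derives % below 185% poverty from table C17002. The variable codes follow the published C17002 layout (001 is the total; 002 through 006 are the ratio bands under 1.85), but whether the project sums exactly these columns is an assumption, and the sample row is invented.

```python
# C17002 ratio-of-income-to-poverty bands that fall below 1.85:
# 002 (<0.50), 003 (0.50-0.99), 004 (1.00-1.24), 005 (1.25-1.49), 006 (1.50-1.84)
BELOW_185_VARS = [
    "C17002_002E", "C17002_003E", "C17002_004E", "C17002_005E", "C17002_006E",
]
TOTAL_VAR = "C17002_001E"


def below_185_count(row):
    """Number of people with income below 185% of the poverty level."""
    return sum(row[v] for v in BELOW_185_VARS)


def pct_below_185(row):
    """Percentage below 185% poverty, or None when the universe is empty."""
    total = row[TOTAL_VAR]
    if total == 0:
        return None
    return 100.0 * below_185_count(row) / total


# Hypothetical block group record.
row = {
    "C17002_001E": 2000,
    "C17002_002E": 100, "C17002_003E": 150, "C17002_004E": 100,
    "C17002_005E": 100, "C17002_006E": 150,
    "C17002_007E": 200, "C17002_008E": 1200,
}

print(below_185_count(row), pct_below_185(row))
```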

Data: ACS 5-Year 2020–2024

9

Why Recompute, Not Average?

A critical detail: when aggregating block groups into zones, we sum the counts first, then recompute percentages — we never average percentages directly.

Why? Consider two block groups overlapping a zone:

Block Group	Weight	Population	Below 185% FPL	% Below 185% FPL
BG A	0.8	2,000	600	30%
BG B	0.3	500	25	5%

Wrong (averaging the percentages): (30% × 0.8 + 5% × 0.3) / (0.8 + 0.3) = 23.2%

Correct (adding up the people first): (600 × 0.8 + 25 × 0.3) / (2000 × 0.8 + 500 × 0.3) = 487.5 / 1750 = 27.9%

Averaging percentages gives BG B about 27% of the influence on the result (0.3 / 1.1), even though it contributes only about 9% of the zone’s estimated residents (150 of 1,750). This is a well-known statistical pitfall that distorts the true picture.
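The two calculations can be reproduced with a short script. The numbers are the ones from the example above; the function and field names are illustrative only.

```python
block_groups = [
    {"name": "BG A", "weight": 0.8, "population": 2000, "below_185": 600},
    {"name": "BG B", "weight": 0.3, "population": 500, "below_185": 25},
]


def averaged_percentage(bgs):
    """Wrong: weight-average the block group percentages."""
    num = sum(bg["weight"] * 100.0 * bg["below_185"] / bg["population"] for bg in bgs)
    den = sum(bg["weight"] for bg in bgs)
    return num / den


def recomputed_percentage(bgs):
    """Correct: sum the weighted counts, then divide once."""
    below = sum(bg["weight"] * bg["below_185"] for bg in bgs)
    people = sum(bg["weight"] * bg["population"] for bg in bgs)
    return 100.0 * below / people


wrong = averaged_percentage(block_groups)
right = recomputed_percentage(block_groups)
print(round(wrong, 1), round(right, 1))
```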

Reference

Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15(3), 351–357.

10

Block-Level Race Data (Decennial)

For the dot-density map, we need finer detail than block groups. The 2020 Decennial Census provides race/ethnicity counts at the block level — much smaller areas, often just a few city blocks.

The map shows Census block boundaries (dashed outlines) within the Northside area. Each block has population counts for 6 race categories.

Census table codes

2020 Census P.L. 94-171 Redistricting Data, tables P1 (race) and P2 (Hispanic origin)

Data: 2020 Decennial Census, block-level race data • TIGER/Line 2020 block geometries

Limitation: To protect people’s privacy, the Census Bureau adds small random changes to block-level counts (a technique called “differential privacy”). A block with 3 Asian residents might be published as 0 or 7. The overall geographic patterns remain reliable despite this noise in individual blocks.

11

Downscaling ACS to Blocks

The Decennial Census gives us block-level race data, but the ACS’s socioeconomic data (income, poverty, etc.) is only available at the larger block-group level. To create detailed color-coded maps, we split the block-group data down to individual blocks.

Each small block gets a share of its neighborhood’s data, based on how much residential land it contains. Counts like population are divided proportionally, while median income is copied directly (you can’t split a median).

See the formula

block_count = bg_count × (block_res_area / bg_res_area)

Each block inherits a fraction of its parent block group’s counts, weighted by residential parcel area. Percentages are recomputed from the downscaled counts.
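A minimal sketch of this downscaling step is shown below, assuming each block's residential area has already been measured. The function name, field names, and numbers are hypothetical, and the text does not say what happens when a block group has no residential land, so the sketch simply raises an error in that case.

```python
def downscale_to_blocks(bg_counts, bg_median_income, block_res_areas):
    """Split block-group counts across blocks by residential area share.

    bg_counts:       {count_name: value} for the parent block group
    block_res_areas: {block_id: residential parcel area inside the block}
    """
    bg_res_area = sum(block_res_areas.values())
    if bg_res_area <= 0:
        raise ValueError("no residential area in block group; fallback not covered here")

    blocks = {}
    for block_id, res_area in block_res_areas.items():
        share = res_area / bg_res_area
        row = {name: value * share for name, value in bg_counts.items()}
        row["median_income"] = bg_median_income  # copied, not split
        row["pct_below_185"] = (
            100.0 * row["below_185"] / row["population"] if row["population"] else None
        )
        blocks[block_id] = row
    return blocks


blocks = downscale_to_blocks(
    {"population": 1200, "below_185": 300},
    85000,
    {"block_1": 3.0, "block_2": 1.0},
)
print(blocks)
```

Because every count in a block is scaled by the same share, the recomputed percentage in this sketch equals the parent block group's percentage for every block that receives any population; the downscaling changes where people are placed, not the rates themselves.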

The map shows blocks colored by downscaled % below 185% poverty.

Limitation: Downscaling adds another layer of estimation uncertainty. Block-level poverty values are model outputs, not direct observations.

12

Dot-Density Map: One Dot Per Person

The dot-density layer represents every person in the 2020 Census as a single colored dot, placed randomly within their Census block. Six race/ethnicity categories are shown:

White

Black

Hispanic/Latino

Asian

Multiracial

Native Am./Other

At a 1:1 ratio, the full district map contains ~95,000 dots. This page shows only the Northside area to keep the file manageable.

Data: 2020 Decennial Census, P.L. 94-171 block-level race data • Color scheme from censusdots.com

13

Constraining Dots to Parcels

Instead of scattering dots randomly across entire Census blocks — including roads, parking lots, and commercial buildings — we place dots only on residential land. Each dot is constrained to the overlap between its Census block and nearby residential parcels.

Compare the dots with the green parcel outlines: dots cluster on residential land, not on roads or institutional areas.

Algorithm details

For each block, a placement geometry is computed as block ∩ union(nearby_residential_parcels). If the intersection area is < 10 m², the full block is used as fallback. Random points are generated via shapely.random_points() with a fixed seed (rng = np.random.default_rng(42)) for reproducibility.
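The placement logic can be illustrated without any GIS library. The sketch below is a deliberate simplification and not the project's implementation: axis-aligned rectangles stand in for real block and parcel polygons, parcels are assumed not to overlap each other, and Python's random module replaces the shapely and NumPy calls named above. All names are hypothetical.

```python
import random


def rect_intersection(a, b):
    """Intersect two (xmin, ymin, xmax, ymax) rectangles; None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def rect_area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def place_dots(block, parcels, n_dots, rng, min_area=10.0):
    """Scatter n_dots uniformly over block ∩ parcels, or over the whole block as fallback."""
    clipped = (rect_intersection(block, p) for p in parcels)
    pieces = [c for c in clipped if c is not None]
    if sum(rect_area(p) for p in pieces) < min_area:
        pieces = [block]  # too little residential land: use the full block
    areas = [rect_area(p) for p in pieces]
    dots = []
    for _ in range(n_dots):
        # Pick a piece in proportion to its area, then a uniform point inside it.
        r = rng.choices(pieces, weights=areas, k=1)[0]
        dots.append((rng.uniform(r[0], r[2]), rng.uniform(r[1], r[3])))
    return dots


block = (0.0, 0.0, 100.0, 100.0)  # metres, toy coordinates
parcels = [(10.0, 10.0, 30.0, 30.0), (60.0, 60.0, 90.0, 80.0)]
dots = place_dots(block, parcels, 50, random.Random(42))
fallback_dots = place_dots(block, [(0.0, 0.0, 1.0, 1.0)], 5, random.Random(42))
print(len(dots), len(fallback_dots))
```

Choosing a piece in proportion to its area before sampling inside it keeps the dot density uniform across the combined residential land, which mirrors the uniform-placement behavior described in the limitation below.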

Remaining limitation: Within residential parcels, dots are placed uniformly. A 200-unit apartment complex and a 1-acre single-family lot receive the same dot density per unit area, despite very different population densities.

14

Zone Aggregation

The final step: add up all the weighted pieces to get a demographic profile for each school’s attendance zone. Counts are summed first, then percentages are calculated from those totals.

The map shows attendance zones colored by % below 185% poverty (the FRL eligibility proxy). Click any zone to see its full demographic profile.

Key finding: The residential-land weighting produced meaningful shifts. For example, one Title I school zone shifted from 19.2% to 23.5% poverty (+4.3 percentage points) after weighting by residential land, moving closer to the known 30–36% FRL rate.

15

MLS Home Sales

The map shows MLS home sale records (2023–2025) across the district, color-coded by price quartile. Each dot is a closed sale.

Home sale data adds a market-level perspective to Census demographics. The production map lets users switch between zone definitions and see how sale counts, median prices, and price-per-square-foot shift when boundaries are redrawn by proximity rather than official assignment.

Data: Triangle MLS, closed sales 2023–2025 within CHCCS district

16

Affordable Housing

The map shows affordable housing locations across the district, color-coded by AMI (Area Median Income) band. Each dot is a subsidized unit.

Affordable housing is unevenly distributed across the district. The production map aggregates unit counts per zone, allowing comparison of how many subsidized units fall within each school’s catchment area under different zone definitions.

Data: Town of Chapel Hill ArcGIS (2025)

17

MLS Home Sales

The map can overlay MLS home sale records from the Triangle MLS (2023–2025). Each dot represents a closed sale, color-coded by price quartile:

Bottom 25%

25th–50th pctl

50th–75th pctl

Top 25%

Addresses are geocoded using the Census Bureau batch geocoder (primary) with Nominatim fallback, then clipped to the CHCCS district boundary. Three metrics are computed per zone:

Homes Sold — total count of closed sales
Median Home Price — median close price
Median Price/SqFt — median close price per square foot
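The three per-zone metrics and the quartile coloring can be sketched with the standard library. The record fields, the sample sales, and the choice to compute quartile cut points across all sales with the default method of statistics.quantiles are assumptions for illustration; the source does not specify them.

```python
from bisect import bisect_right
from statistics import median, quantiles


def zone_sale_metrics(sales):
    """Group closed sales by zone and compute the three metrics listed above."""
    by_zone = {}
    for sale in sales:
        by_zone.setdefault(sale["zone"], []).append(sale)
    return {
        zone: {
            "homes_sold": len(rows),
            "median_price": median(r["close_price"] for r in rows),
            "median_price_per_sqft": median(r["close_price"] / r["sqft"] for r in rows),
        }
        for zone, rows in by_zone.items()
    }


def price_quartile(price, cut_points):
    """Return 0 (bottom 25%) through 3 (top 25%) given three quartile cut points."""
    return bisect_right(cut_points, price)


# Hypothetical sales records.
sales = [
    {"zone": "A", "close_price": 300000, "sqft": 1500},
    {"zone": "A", "close_price": 500000, "sqft": 2000},
    {"zone": "A", "close_price": 700000, "sqft": 2000},
    {"zone": "B", "close_price": 400000, "sqft": 2000},
    {"zone": "B", "close_price": 600000, "sqft": 2400},
]

metrics = zone_sale_metrics(sales)
cuts = quantiles([s["close_price"] for s in sales], n=4)
print(metrics, cuts)
```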

Limitations

MLS data covers only listed sales — FSBO and off-market transactions are excluded.
Geocoding accuracy varies; some addresses may map to approximate locations.
Zones with few sales may show volatile medians (small sample size).
Sales span three years (2023–2025) and do not reflect single-point-in-time pricing.

Data: Triangle MLS, closed sales 2023–2025 within CHCCS district

18

Planned Developments (CH Active Dev)

The map can overlay planned residential developments from the Town of Chapel Hill’s Active Development page (hand-transcribed March 12, 2026). Each circle represents one development, colored by expected unit count:

400+ units

150–400 units

50–150 units

<50 units

Developments are geocoded using the Census Bureau batch geocoder (primary) with Nominatim fallback, then clipped to the CHCCS district boundary. Two metrics are computed per zone:

Total Expected Units — sum of expected units across all developments in the zone
Number of Developments — count of projects in the zone

Limitations

Covers Chapel Hill only — no Carrboro or unincorporated Orange County projects.
Projects at various approval stages — some may not proceed or may change scope.
Unit counts are estimates from planning documents, not final construction figures.
Geocoding is approximate (road-segment interpolation, not exact site boundaries).

Data: Town of Chapel Hill Active Development page, hand-transcribed March 12, 2026

19

Planned Developments (SAPFOTAC)

A supplementary dataset from the SAPFOTAC 2025 Annual Report (Schools Adequate Public Facilities Ordinance Technical Advisory Committee, certified June 3, 2025) provides 21 future residential projects with projected student yields: estimates of the elementary, middle, and high school students each development will generate.

The same blue-to-red color scheme applies, scaled by remaining housing units. Click a marker for per-project detail including student yield breakdowns.

How it differs from CH Active Dev

Adds student yield projections (elementary, middle, high) — not available in CH Active Dev.
Covers Chapel Hill + Carrboro (e.g., Jade Creek, Newbury).
Different vintage — certified June 2025 vs. March 2026 for CH Active Dev.
Some overlap — projects like Gateway, South Creek, and Aura Chapel Hill appear in both sources. The datasets are not deduplicated.

Limitations

Student yields are model estimates based on generation rates, not actual enrollment.
Geocoding is approximate (same Census + Nominatim pipeline).
Some projects fall outside the CHCCS district boundary and are not assigned to any zone in bar-chart aggregation.

Data: SAPFOTAC 2025 Annual Report, certified June 3, 2025

20

Zone Definitions Matter

The bar charts show % below 185% poverty for each school under two different zone definitions — computed from the same underlying Census data using the same dasymetric methodology.

School Zones use the official CHCCS attendance boundaries. Nearest Drive assigns each location to whichever school is closest by driving distance (Dijkstra shortest-path on the OpenStreetMap road network).
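One way to picture the Nearest Drive assignment is a multi-source Dijkstra search that starts from every school at once, so each road node is claimed by whichever school reaches it first. This is an illustrative sketch on a toy graph, not the production routine: the multi-source formulation, the graph layout, and the node names are assumptions, and treating every road as two-way is a simplification.

```python
import heapq


def nearest_school_by_drive(graph, school_nodes):
    """Label every reachable node with its closest school by path length.

    graph:        {node: [(neighbor, road_length), ...]}
    school_nodes: {school_name: node}
    Returns {node: (distance, school_name)}.
    """
    best = {}
    heap = [(0.0, name, node) for name, node in school_nodes.items()]
    heapq.heapify(heap)
    while heap:
        dist, school, node = heapq.heappop(heap)
        if node in best:
            continue  # already claimed by a closer (or equally close) school
        best[node] = (dist, school)
        for neighbor, length in graph.get(node, []):
            if neighbor not in best:
                heapq.heappush(heap, (dist + length, school, neighbor))
    return best


# Toy road network: s1 --1.0-- a --2.0-- b --1.5-- s2
graph = {
    "s1": [("a", 1.0)],
    "a": [("s1", 1.0), ("b", 2.0)],
    "b": [("a", 2.0), ("s2", 1.5)],
    "s2": [("b", 1.5)],
}
schools = {"School 1": "s1", "School 2": "s2"}

assignment = nearest_school_by_drive(graph, schools)
print(assignment)
```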

The differences illustrate how boundary definitions affect school-level demographic composition. A school whose official zone extends into a lower-income area may show a very different poverty rate when zone boundaries are redrawn by proximity.

The full production map lets users switch between 5 zone types interactively, with bar charts that update in real time.

21

Limitations & Caveats

The socioeconomic analysis has 26 documented limitations. Key ones:

Data Quality

ACS margins of error are not tracked or displayed. For small block groups, confidence intervals can be wide.
5-year rolling average masks recent demographic shifts (new housing, gentrification).
Privacy protections in Census data can distort small race counts at the block level.

Geographic Alignment

Zone ≠ enrollment: Zone demographics describe residents, not enrolled students. There is a known gap of roughly 10 percentage points between zone poverty estimates and actual school FRL rates.
Temporal mismatch: ACS (2020–2024), Decennial (2020), parcels (current), zones (current) use different vintages.
Edge distortion: Block groups at district edges extend into neighboring districts; clipping creates slivers with potentially unrepresentative demographics.

Methodological

Median income approximation: Weighted average of block group medians ≠ true zone median.
Uniform dot placement within parcels ignores density variation (apartments vs. single-family).
No ACS margins of error propagated through interpolation.
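The median income caveat above is easy to see with a toy example. The household incomes below are invented (in thousands of dollars), and weighting by household count is an assumption; in practice only block group medians are published, so the true zone median cannot be computed this way.

```python
from statistics import median

# Hypothetical household incomes in two block groups (thousands of dollars).
bg1_incomes = [20, 30, 100]
bg2_incomes = [40, 50, 60, 70, 200]

# Approximation: average the two medians, weighted by household count.
weighted_avg_of_medians = (
    median(bg1_incomes) * len(bg1_incomes) + median(bg2_incomes) * len(bg2_incomes)
) / (len(bg1_incomes) + len(bg2_incomes))

# True median of the pooled households, which needs the underlying records.
true_median = median(bg1_incomes + bg2_incomes)

print(weighted_avg_of_medians, true_median)
```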

Full documentation

The complete list of 26 limitations with references and validation results is available in the project documentation.

Bottom line: The analysis provides useful relative comparisons between zones (which zones have higher/lower poverty, more/less diversity) but the absolute numbers carry meaningful uncertainty. Treat all values as estimates, not precise counts.