About

Data & methodology

Every figure on Gazetteer resolves to a source within one click. The list below is generated automatically from content/sources.json — attribution strings and licence terms are not hardcoded anywhere else in the codebase.
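As a rough illustration of that generation step, the sketch below renders one display line per source from a JSON list. The key names (`name`, `licence`, `attribution`) and the two-entry excerpt are assumptions made for the example; the real schema of content/sources.json is not shown on this page.

```python
import json

# Hypothetical excerpt; the key names are assumed, not taken from the real file.
SOURCES_JSON = """
[
  {"name": "ABS ASGS Edition 3 (2021)",
   "licence": "CC BY 4.0",
   "attribution": "Boundary data: Australian Bureau of Statistics, Australian Statistical Geography Standard (ASGS) Edition 3, 2021. Licensed under CC BY 4.0."},
  {"name": "ABS Census 2021",
   "licence": "CC BY 4.0",
   "attribution": "Source: Australian Bureau of Statistics, Census of Population and Housing 2021. Licensed under CC BY 4.0."}
]
"""


def render_sources(raw: str) -> list[str]:
    """Return one display line per source, in file order."""
    return [
        f'{entry["name"]} [{entry["licence"]}]: {entry["attribution"]}'
        for entry in json.loads(raw)
    ]


rendered = render_sources(SOURCES_JSON)
```

Because every line is built from the JSON entries, changing an attribution string in the file changes it everywhere it is displayed.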

Indigenous Data Sovereignty: Gazetteer treats Indigenous Census variables as outcomes of colonisation, not overlays of difference. Where a number describes a community, the framing credits that community. ABS Census 2021 was released under CC BY 4.0 for public use; the responsibility for respectful framing is editorial, not legal.

Sources

ABS ASGS Edition 3 (2021) ABS Creative Commons Attribution 4.0 International (CC BY 4.0)
Full name
Australian Bureau of Statistics — Australian Statistical Geography Standard (ASGS) Edition 3, 2021
Home
https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3 live ↗
Retrieval
Bulk shapefile / GeoPackage download for SA2 + SA3 boundaries (and SAL for label rendering). Convert to GeoJSON, then to PMTiles via tippecanoe with a separate source-layer per level for serving from Cloudflare R2. SA2 paints above zoom 6, SA3 below.
Licence
Creative Commons Attribution 4.0 International (CC BY 4.0) live ↗
Coverage
All Australia. SA1 / SA2 / SA3 / SA4 / GCCSA / STE / LGA / SAL / POA boundaries. Gazetteer ships SA2 (~2,310 polygons, primary unit) and SA3 (~358 polygons, zoomed-out smoothed view). SAL labels are used for human-readable suburb names.
Refresh
Updated with each Census cycle (5-yearly). Edition 3 is the current standard; Edition 4 expected post-2026 Census.
Attribution
Boundary data: Australian Bureau of Statistics, Australian Statistical Geography Standard (ASGS) Edition 3, 2021. Licensed under CC BY 4.0.

Caveats

  • SA2 boundaries are designed for statistical purposes, not administrative — they don't always align with how locals perceive 'suburb' boundaries.
  • The SAL (Suburb and Locality) layer provides the human-readable suburb names but doesn't tile cleanly to SA2, so Gazetteer uses SAL only for labelling and SA2 for data joins.
  • Sparsely populated regions are covered by very large SA2 polygons, so the choropleth needs careful classification (quantile or natural breaks) to avoid the map being dominated by empty outback SA2s.
  • SA3 nests every SA2 sharing the same first five digits of SA2_CODE21. Zooming out to SA3 smooths sparse-population outliers — useful for the national overview where individual outback SA2s would otherwise dominate the choropleth.
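The prefix rule in the last caveat can be sketched as below. It assumes SA2_CODE21 is held as a 9-digit string; the codes and populations are illustrative values, not real Census figures, and the function names are hypothetical.

```python
from collections import defaultdict


def sa3_code(sa2_code: str) -> str:
    """Return the parent SA3 code: the first five digits of a 9-digit SA2 code."""
    if len(sa2_code) != 9 or not sa2_code.isdigit():
        raise ValueError(f"expected a 9-digit SA2 code, got {sa2_code!r}")
    return sa2_code[:5]


def rollup_to_sa3(sa2_population: dict[str, int]) -> dict[str, int]:
    """Sum SA2 populations into their parent SA3s."""
    totals: dict[str, int] = defaultdict(int)
    for code, persons in sa2_population.items():
        totals[sa3_code(code)] += persons
    return dict(totals)


# Illustrative codes and counts only.
example = {"101021007": 4000, "101021008": 9000, "101031013": 2500}
sa3_totals = rollup_to_sa3(example)
```

Keeping the codes as strings avoids losing leading digits and makes the prefix slice safe.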

ABS Census 2021 ABS Creative Commons Attribution 4.0 International (CC BY 4.0)
Full name
Australian Bureau of Statistics — Census of Population and Housing 2021
Home
https://www.abs.gov.au/census/find-census-data/datapacks live ↗
Alt
https://data.abs.gov.au/ live ↗
Retrieval
Primary: DataPacks CSV download (General Community Profile + Aboriginal and Torres Strait Islander Peoples Profile) at SA2 geography. Fallback: ABS Data API at data.abs.gov.au for ad-hoc queries.
Licence
Creative Commons Attribution 4.0 International (CC BY 4.0) live ↗
Coverage
All Australia. Available at SA1 / SA2 / SA3 / SA4 / LGA / SAL (Suburb and Locality) / POA (Postal Area). Gazetteer ingests at SA2.
Geography standard
ASGS Edition 3 (Australian Statistical Geography Standard, 2021).
Geography units
~2,310 SA2s nationally; median population ~10,000.
Refresh
5-yearly (Census). 2021 is the current edition; next release expected ~2027–2028 following the 2026 Census.
Attribution
Source: Australian Bureau of Statistics, Census of Population and Housing 2021. Licensed under CC BY 4.0.

Caveats

  • Small-count suppression: ABS replaces counts <3 with zeros / randomised perturbation to protect privacy.
  • Population floor: SA2s with population <200 are excluded from Gazetteer rankings to reduce noise.
  • 'Not stated' responses inflate denominators for percentage fields — affects unemployment rate, % no internet, % born overseas, etc.
  • Census cadence means refresh is essentially one-shot until the 2026 Census release lands.
  • Indigenous status undercount is documented by ABS — figures are 'persons who identified as Indigenous in the Census', not a comprehensive count.
  • Some SA2 boundaries shifted between the 2016 and 2021 ASGS editions; Gazetteer uses 2021 boundaries throughout for consistency.
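A minimal sketch of the population floor applied before ranking. The row layout (`sa2_code`, `population`), the highest-first direction, and the sample values are assumptions for illustration; only the 200-person threshold comes from the caveat above.

```python
POPULATION_FLOOR = 200  # SA2s below this are left out of rankings


def rank_sa2s(rows: list[dict], field: str) -> list[tuple[int, str]]:
    """Rank SA2s on `field`, highest value first, after applying the floor."""
    eligible = [
        r for r in rows
        if r["population"] >= POPULATION_FLOOR and r.get(field) is not None
    ]
    ordered = sorted(eligible, key=lambda r: r[field], reverse=True)
    return [(position, r["sa2_code"]) for position, r in enumerate(ordered, start=1)]


# Illustrative rows only; the second SA2 falls under the floor.
rows = [
    {"sa2_code": "101021007", "population": 4000, "median_rent_weekly": 350},
    {"sa2_code": "101021008", "population": 150, "median_rent_weekly": 900},
    {"sa2_code": "101031013", "population": 2500, "median_rent_weekly": 420},
]
ranking = rank_sa2s(rows, "median_rent_weekly")
```

The excluded SA2 would otherwise have taken first place on a value computed from very few households, which is the noise the floor is meant to remove.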

Fields

Each field is collapsed by default — click a row to see the method, the exact DataPack cell selection, and any derivation notes.

Total population population (persons)
Median age median_age (years)

Method: G02 — Selected Medians and Averages live ↗

Cells: Table G02, Column Median_age_persons archive snapshot ↗

Median household income (weekly) median_household_income_weekly (AUD/week)

Method: G02 — Selected Medians and Averages live ↗

Cells: Table G02, Column Median_tot_hhd_inc_weekly archive snapshot ↗

Equivalised total household income excluded for v0; raw weekly median used as the visceral number.

Median rent (weekly) median_rent_weekly (AUD/week)

Method: G02 — Selected Medians and Averages live ↗

Cells: Table G02, Column Median_rent_weekly archive snapshot ↗

Unemployment rate unemployment_rate (percent)

Method: G43 — Selected Labour Force, Education and Migration Characteristics by Sex live ↗

Cells: Table G43, Column Percent_Unem_loyment_P archive snapshot ↗

Derivation: ABS's own Percent_Unem_loyment_P column from G43 (unemployed / labour force). Excludes 'not in labour force' by construction.

Households renting pct_renting (percent)

Method: G37 — Tenure Type and Landlord Type by Dwelling Structure live ↗

Cells: Table G37, Numerator R_Tot_Total, Denominator Total_Total archive snapshot ↗

Derivation: R_Tot_Total (all rental subcategories summed by ABS) / Total_Total (all tenure types including 'not stated').
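The numerator / denominator pattern used by this field (and by the other percentage fields on this page) could look like the sketch below. The helper name and the sample counts are hypothetical; the column names are the ones quoted above.

```python
from typing import Optional


def percent(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator as a percentage of denominator, or None if undefined."""
    if denominator <= 0:
        return None  # empty SA2: no share can be reported
    return 100.0 * numerator / denominator


# Illustrative G37 cells for one SA2.
row = {"R_Tot_Total": 1200, "Total_Total": 4000}
pct_renting = percent(row["R_Tot_Total"], row["Total_Total"])
```

Returning None rather than 0 for an empty denominator keeps "no data" distinct from "no renters".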

Average persons per bedroom avg_persons_per_bedroom (persons/bedroom)

Method: G02 — Selected Medians and Averages live ↗

Cells: Table G02, Column Average_num_psns_per_bedroom archive snapshot ↗

v0 overcrowding proxy. ABS-published average; values >1.0 indicate more persons than bedrooms on average. Will be superseded by overcrowding_rate (G32 / CNOS) once a dedicated resolver lands.

Overcrowding rate (Canadian National Occupancy Standard) overcrowding_rate (percent)

Method: G32 — Number of Bedrooms by Number of Persons Usually Resident live ↗

Pending: G32 is a bedrooms × usual-residents cross-tab; CNOS derivation requires joint rules across cells, not a single column pair. Needs a dedicated resolver. Until then, avg_persons_per_bedroom is the active overcrowding metric.

Derivation: Households requiring one or more additional bedrooms per CNOS / total occupied private dwellings.

Dwellings with no internet connection pct_no_internet (percent)

Pending: Variable dropped from 2021 Census — no equivalent field in the 2021 GCP DataPack. Last reported in 2016.

Dwellings with no motor vehicle pct_no_car (percent)

Method: G34 — Number of Motor Vehicles by Dwellings live ↗

Cells: Table G34, Numerator Num_MVs_per_dweling_0_MVs, Denominator Total_dwelings archive snapshot ↗

Single-parent families (share of all families) pct_single_parent_families (percent)

Method: G29 — Family Composition live ↗

Cells: Table G29, Numerator OPF_Total_P, Denominator Total_P archive snapshot ↗

Derivation: OPF_Total_P (one-parent families, all persons) / Total_P (all families). Family-composition cells are subject to ABS small-count perturbation; shares in low-population SA2s (under the 200-person ranking floor) are noisy.

Adults (15+) with no post-school qualification pct_no_post_school_qualification (percent)

Method: G43 — Selected Labour Force, Education and Migration Characteristics by Sex live ↗

Cells: Table G43, Denominator P_15_yrs_over_P archive snapshot ↗

Derivation: (P_15_yrs_over_P minus the sum of the non_sch_qual level bands: PostGrad_Dgre_P, Gr_Dip_Gr_Crt_P, Bchelr_Degree_P, Advnd_Dip_Dip_P, CertTot_Level_P) / P_15_yrs_over_P. CertTot is the certificate roll-up (Cert I-IV plus level-not-further-defined), so certificate holders are counted once. The G43 summary carries no 'not stated' split for qualifications, so not-stated and inadequately-described responses fall into the no-qualification residual — the figure runs slightly high relative to a G46-based derivation. Residuals below zero (random-rounding artefacts in tiny SA2s) are clipped to 0.
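The residual derivation above, including the clip at zero, can be sketched as follows. The function name and the sample counts are illustrative; the column names are those listed in the derivation.

```python
from typing import Optional

# Qualification level bands subtracted from the 15+ population.
QUAL_COLUMNS = (
    "PostGrad_Dgre_P",
    "Gr_Dip_Gr_Crt_P",
    "Bchelr_Degree_P",
    "Advnd_Dip_Dip_P",
    "CertTot_Level_P",
)


def pct_no_post_school_qualification(row: dict) -> Optional[float]:
    """Share of persons 15+ with no listed post-school qualification."""
    adults = row["P_15_yrs_over_P"]
    if adults <= 0:
        return None
    residual = adults - sum(row[col] for col in QUAL_COLUMNS)
    # Random rounding can push the residual below zero in tiny SA2s.
    return 100.0 * max(residual, 0) / adults


# Illustrative counts only.
typical = {"P_15_yrs_over_P": 1000, "PostGrad_Dgre_P": 50, "Gr_Dip_Gr_Crt_P": 30,
           "Bchelr_Degree_P": 200, "Advnd_Dip_Dip_P": 100, "CertTot_Level_P": 220}
tiny = {"P_15_yrs_over_P": 10, "PostGrad_Dgre_P": 3, "Gr_Dip_Gr_Crt_P": 0,
        "Bchelr_Degree_P": 3, "Advnd_Dip_Dip_P": 3, "CertTot_Level_P": 3}
```

In the `tiny` row the bands sum to 12 against 10 adults, so the residual is clipped and the share reported as 0.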

Aboriginal and/or Torres Strait Islander population pct_indigenous (percent)

Method: G07 — Indigenous Status by Age by Sex live ↗

Cells: Table G07, Numerator Tot_Indigenous_P, Denominator Tot_Tot_P archive snapshot ↗

Cross-checked against the Aboriginal and Torres Strait Islander Peoples Profile (IP01) where available. Framing per Indigenous Data Sovereignty principle: outcomes-of-colonisation, not 'Indigenous presence as overlay'.

Persons born overseas pct_born_overseas (percent)

Method: G01 — Selected Person Characteristics by Sex (Birthplace aggregate) live ↗

Cells: Table G01, Numerator Birthplace_Elsewhere_P, Denominator Tot_P_P archive snapshot ↗

Derivation: Birthplace_Elsewhere_P / Tot_P_P. Sourced from G01 rather than G09 (which splits countries across files A–G); G01 carries the Australia/Elsewhere aggregate directly.

Top 3 languages used at home (other than English) languages_top_3 (language names + counts)

Method: G13 — Language Used at Home by Sex live ↗

Derivation: Top 3 by speaker count, excluding 'English only', 'not stated', and 'inadequately described'.
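A small sketch of that selection. The category labels and speaker counts are placeholders (the real G13 column names differ), and ties are broken alphabetically here as an assumption.

```python
# Labels to drop before ranking; placeholder spellings, compared case-insensitively.
EXCLUDED = {"english only", "not stated", "inadequately described"}


def top_languages(counts: dict[str, int], n: int = 3) -> list[tuple[str, int]]:
    """Return the n most-used languages other than English, with speaker counts."""
    eligible = [
        (language, speakers)
        for language, speakers in counts.items()
        if language.casefold() not in EXCLUDED
    ]
    # Highest count first; alphabetical order breaks ties (assumed rule).
    return sorted(eligible, key=lambda item: (-item[1], item[0]))[:n]


# Illustrative counts only.
counts = {
    "English only": 5200,
    "Mandarin": 310,
    "Vietnamese": 450,
    "Arabic": 310,
    "Greek": 90,
    "Not stated": 400,
}
languages_top_3 = top_languages(counts)
```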

ABS Mesh Block Allocation + 2021 Census MB Counts ABS Creative Commons Attribution 4.0 International (CC BY 4.0)
Full name
Australian Bureau of Statistics — ASGS Edition 3 (2021) Mesh Block allocation file joined with 2021 Census Mesh Block Counts
Home
https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/allocation-files live ↗
Retrieval
Two-source xlsx download joined on MB_CODE_2021. (1) Allocation file (MB_2021_AUST.xlsx) from the ASGS Edition 3 allocation-files page: a single sheet of MB → SA1/SA2/SA3/SA4/GCCSA/STE hierarchy codes plus MB_CATEGORY_2021 (Residential, Commercial, Parkland, Primary Production, Education, Other). (2) Census 2021 Mesh Block Counts xlsx from the Census guide-census-data page: a multi-sheet workbook (one sheet per state/territory plus an AUS rollup and a notes sheet) carrying per-MB count of persons usually resident (Person column). Column header sits on row 7 of each data sheet. Pipeline fetcher: `pipeline/fetch_abs_correspondence.py`.
Licence
Creative Commons Attribution 4.0 International (CC BY 4.0) live ↗
Coverage
All Australia. ~370,000 Mesh Blocks. MBs nest exactly within SA1, SA2, SA3, SA4, GCCSA, STE — no overlap, no gaps.
Geography standard
ASGS Edition 3 (2021).
Refresh
5-yearly with each Census; tied to the ASGS edition.
Attribution
Geography allocation and Mesh Block population: Australian Bureau of Statistics — ASGS Edition 3 (2021) Mesh Block allocation file and 2021 Census Mesh Block Counts. Licensed under CC BY 4.0.

Caveats

  • Person counts are place-of-usual-residence (where people live), not Census-night — less distorted by tourist destinations and fly-out workforces than Census-night estimates.
  • ABS small-count perturbation applies — MB-level counts under 3 are randomised. Aggregating up to SA2 washes most of this out.
  • MB_CATEGORY_2021 declares land-use class. The dasymetric mask combines both signals — MBs are kept only when category == 'Residential' AND population > 0. Category captures planning intent (an empty residential subdivision is still 'where housing goes'); population captures actual occupancy (an isolated declared usual residence on parkland isn't a residential area).
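The two-signal mask in the last caveat can be sketched as a join on the Mesh Block code followed by a filter. The in-memory dictionaries stand in for the two xlsx files, the Mesh Block codes are made up, and treating a Mesh Block that is missing from the counts file as zero population is an assumption.

```python
def residential_mask(allocation: dict[str, dict], persons: dict[str, int]) -> list[str]:
    """Return Mesh Block codes that are Residential AND have population > 0."""
    kept = []
    for mb_code, attrs in allocation.items():
        population = persons.get(mb_code, 0)  # missing count treated as 0 (assumed)
        if attrs["MB_CATEGORY_2021"] == "Residential" and population > 0:
            kept.append(mb_code)
    return kept


# Illustrative rows keyed by MB_CODE_2021.
allocation = {
    "MB_A": {"MB_CATEGORY_2021": "Residential"},  # occupied housing: kept
    "MB_B": {"MB_CATEGORY_2021": "Residential"},  # empty subdivision: dropped
    "MB_C": {"MB_CATEGORY_2021": "Parkland"},     # isolated residence: dropped
    "MB_D": {"MB_CATEGORY_2021": "Residential"},  # no count row: dropped
}
persons = {"MB_A": 120, "MB_B": 0, "MB_C": 2}
mask = residential_mask(allocation, persons)
```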

NPI 2023-24 DCCEEW Creative Commons Attribution 4.0 International (CC BY 4.0)
Full name
National Pollutant Inventory — Reporting Year 2023-24
Home
https://www.npi.gov.au/ live ↗
Alt
https://www.dcceew.gov.au/environment/protection/npi/data live ↗
Retrieval
data.gov.au CKAN package `npi` (UUID 043f58e0-a188-4458-b61c-04e5b540aea4). Resources: `facilities.geojson` (registry with lat/lng + ANZSIC) and `emissions.csv` (per-facility per-substance kg/year across air/water/land). Resource UUIDs hard-coded in `pipeline/fetch_npi.py`; refresh annually on or before 31 March.
Licence
Creative Commons Attribution 4.0 International (CC BY 4.0) live ↗
Coverage
All Australia. Per-facility point data (lat/lng) for facilities meeting NPI reporting thresholds. ~4,000 facilities reporting in recent years.
Geography
Per-facility lat/lng. Gazetteer attributes facilities to SA2 by point-in-polygon and rolls emissions up to per-SA2 totals for the ranking.
Refresh
Annual. NPI reports on Australian financial years (July–June); 2023-24 release published 2025-03-31.
Attribution
Emissions data sourced from the National Pollutant Inventory, Department of Climate Change, Energy, the Environment and Water (DCCEEW), reporting year 2023-24. Licensed under CC BY 4.0.

Caveats

  • NPI captures facilities above reporting thresholds only — small emitters are excluded by design.
  • Self-reported by facility operators; quality varies. Substance categories follow NPI's own taxonomy (~93 substances).
  • Emissions are annual totals — no temporal granularity within a reporting year.
  • Some facilities aggregate multiple sub-sites into a single report at one lat/lng — geographic precision is operator-dependent.
  • Per-SA2 ranking sums air_total + water + land kg across all substances and facilities sited in that SA2; mass is not chemistry-weighted, so a tonne of carbon monoxide and a tonne of lead are treated as equal mass.
  • Per-resident values divide facility emissions by Census usual-resident population — they measure local emission intensity, not personal footprint. An industrial SA2 with few residents will rank high by construction.
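The last two caveats describe a plain sum followed by a division, sketched below. The record keys (`sa2`, `air_total_kg`, `water_kg`, `land_kg`) and all figures are hypothetical; the actual column names in emissions.csv are not shown on this page.

```python
from collections import defaultdict


def sa2_emission_totals(records: list[dict]) -> dict[str, float]:
    """Sum air + water + land kg over every facility and substance in each SA2."""
    totals: dict[str, float] = defaultdict(float)
    for rec in records:
        totals[rec["sa2"]] += rec["air_total_kg"] + rec["water_kg"] + rec["land_kg"]
    return dict(totals)


def per_resident(totals: dict[str, float], population: dict[str, int]) -> dict[str, float]:
    """Divide SA2 totals by usual-resident population; skip SA2s with no residents."""
    return {
        sa2: kg / population[sa2]
        for sa2, kg in totals.items()
        if population.get(sa2, 0) > 0
    }


# Illustrative records: an industrial SA2 with few residents and a suburban one.
records = [
    {"sa2": "industrial", "air_total_kg": 1000.0, "water_kg": 200.0, "land_kg": 0.0},
    {"sa2": "industrial", "air_total_kg": 300.0, "water_kg": 0.0, "land_kg": 0.0},
    {"sa2": "suburban", "air_total_kg": 2500.0, "water_kg": 0.0, "land_kg": 0.0},
]
totals = sa2_emission_totals(records)
intensity = per_resident(totals, {"industrial": 50, "suburban": 10000})
```

The suburban SA2 emits more in total, yet the industrial SA2 ranks far higher per resident, which is the "by construction" effect noted above.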

PlanningAlerts OpenAustralia Foundation Code: Apache-2.0. Data: per contributing council — published under terms set by each council, typically CC BY or equivalent open licence. Aggregator status: OpenAustralia Foundation is the redistributor, not the source-of-truth.
Full name
PlanningAlerts.org.au — operated by the OpenAustralia Foundation
Home
https://www.planningalerts.org.au/ live ↗
API docs
https://www.planningalerts.org.au/api/howto live ↗
Authorities
https://www.planningalerts.org.au/authorities live ↗
Retrieval
REST API, JSON responses. Free Community plan API key required (sign up at planningalerts.org.au). Rate limit: 1,000 requests / day. Pagination: 100 results / page. Endpoint pattern: GET /applications.json with bbox or lat/lng/radius parameters. Gazetteer paginates the AU bounding box daily to refresh a 90-day rolling window.
Licence
Code: Apache-2.0. Data: per contributing council — published under terms set by each council, typically CC BY or equivalent open licence. Aggregator status: OpenAustralia Foundation is the redistributor, not the source-of-truth. live ↗
Coverage
100+ Australian councils via a single API. The canonical authority list is at the Authorities link above. Coverage is not all-AU: gaps exist where councils have not been onboarded. Gazetteer publishes coverage state at /about/coverage so users can see which councils contribute.
Geography
Each DA carries lat/lng. Gazetteer attributes DAs to SA2 via point-in-polygon overlay against ASGS 2021 SA2 boundaries.
Refresh
Daily via GitHub Actions cron. Each refresh fetches DAs lodged or updated in the last 90 days.
Attribution
Development Application data sourced via PlanningAlerts.org.au, operated by the OpenAustralia Foundation. Original DA data is published by individual Australian councils.
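The point-in-polygon attribution described under Geography can be illustrated with a pure-Python ray-casting test. This is a sketch under simplifying assumptions, not the pipeline's code: each SA2 is a single ring of (lng, lat) pairs with no holes or multipart geometry, and the square "SA2" and its code are made up.

```python
from typing import Optional

Ring = list[tuple[float, float]]  # (lng, lat) vertices, first vertex not repeated


def point_in_ring(lng: float, lat: float, ring: Ring) -> bool:
    """Ray casting: count edge crossings of a ray running east from the point."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        if (yi > lat) != (yj > lat):  # edge straddles the point's latitude
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lng < x_cross:
                inside = not inside
        j = i
    return inside


def assign_sa2(lng: float, lat: float, sa2_rings: dict[str, Ring]) -> Optional[str]:
    """Return the code of the first SA2 containing the point, else None."""
    for code, ring in sa2_rings.items():
        if point_in_ring(lng, lat, ring):
            return code
    return None


# Illustrative one-degree square standing in for an SA2 boundary.
sa2_rings = {"SA2_X": [(150.0, -34.0), (151.0, -34.0), (151.0, -33.0), (150.0, -33.0)]}
```

Records without coordinates cannot be passed through this step at all, which is why they are dropped from the overlay rather than guessed.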

Caveats

  • Coverage is council-by-council, not all-AU — absence of DAs in an SA2 may mean no development OR may mean the council does not contribute to PlanningAlerts.
  • DA descriptions are free-text from council planning portals and vary wildly in quality and detail.
  • DA status (lodged, approved, refused) is a snapshot — councils update on different cadences.
  • PlanningAlerts API rate limits constrain backfill: full historical reload takes multiple days at 1,000 req/day. Gazetteer uses rolling 90-day windows to stay under limits.
  • Some councils only publish summary data without precise lat/lng — these DAs are dropped from the spatial overlay and noted in the coverage page.
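The rate-limit caveat is simple arithmetic: 1,000 requests a day at 100 results a page caps a day's fetch at 100,000 applications. The sketch below turns that into a day count; the backlog size used in the example is a made-up figure, and `reserved_requests` is a hypothetical allowance for the daily refresh.

```python
import math

REQUESTS_PER_DAY = 1000   # rate limit quoted under Retrieval
RESULTS_PER_PAGE = 100    # page size quoted under Retrieval


def days_to_fetch(total_applications: int, reserved_requests: int = 0) -> int:
    """Days needed to page through a backlog without exceeding the daily limit."""
    budget = REQUESTS_PER_DAY - reserved_requests
    if budget <= 0:
        raise ValueError("no requests left for backfill")
    pages = math.ceil(total_applications / RESULTS_PER_PAGE)
    return math.ceil(pages / budget)


# A hypothetical backlog of 250,000 applications needs 2,500 pages.
backfill_days = days_to_fetch(250_000)
```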

How the headline and highlights are picked

Each suburb page leads with one big headline rank and three smaller highlight ranks. All four are picked the same way: for every dimension Gazetteer carries a number on, we compute an ordinal rank within both the suburb's state and Australia overall, then pick whichever framing produces the lowest number, i.e. the most "top X" framing. The dimension with the very best (lowest) rank becomes the headline; the next three best become the highlights.

This is a deliberately blunt approach. It's not a significance claim: being #3 of 117 in ACT for one variable does not mean the suburb differs from its peers on that variable in any statistically meaningful sense. The point is to give the reader a quick, honest entry point ("what does this suburb look like next to its peers?"), with every number linking back to the source row above. For a more rigorous treatment of variation see, e.g., any introductory statistics text on z-scores; Gazetteer may switch to z-based picks in a later pass if rank-only framing turns out to be too crude.
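The selection rule can be sketched as below. The function names, the tie-breaks (national framing wins an equal rank; remaining ties fall back to alphabetical order) and the sample ranks are assumptions for illustration.

```python
def best_framing(state_rank: int, national_rank: int) -> tuple[int, str]:
    """Return (rank, scope) for the framing with the lower rank number."""
    if national_rank <= state_rank:  # assumed tie-break: prefer the national framing
        return national_rank, "Australia"
    return state_rank, "state"


def pick_headline_and_highlights(ranks: dict[str, tuple[int, int]]):
    """ranks maps dimension -> (state_rank, national_rank)."""
    if not ranks:
        raise ValueError("no ranked dimensions for this suburb")
    framed = [(*best_framing(s, n), dim) for dim, (s, n) in ranks.items()]
    framed.sort()  # lowest rank first
    picks = [(dim, rank, scope) for rank, scope, dim in framed]
    return picks[0], picks[1:4]


# Illustrative ranks for one suburb: (rank in state, rank in Australia).
ranks = {
    "median_rent_weekly": (3, 40),
    "median_age": (1, 1),
    "pct_renting": (12, 150),
    "unemployment_rate": (7, 90),
    "pct_no_car": (25, 300),
}
headline, highlights = pick_headline_and_highlights(ranks)
```

Here `median_age` becomes the headline and `pct_no_car`, the fifth-best dimension, is not shown.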

Attribution grammar

Every per-field stat on the site is stored as a triple { source_id, method_id, retrieved_at }. When a stat is tapped or hovered, that triple surfaces and links back to the relevant entry above. This keeps attribution honest: no number appears without a source, and no source appears only on the About page.
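One way to model that triple is shown below. The class names, the identifier values and the use of a calendar date for retrieved_at are assumptions; only the three key names come from the text above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Attribution:
    source_id: str      # which source entry the number came from
    method_id: str      # which table / derivation produced it
    retrieved_at: date  # when the underlying data was fetched


@dataclass(frozen=True)
class Stat:
    field: str
    value: float
    attribution: Attribution  # required: a Stat cannot be built without one


# Hypothetical identifiers and values, for illustration only.
stat = Stat(
    field="median_age",
    value=38.0,
    attribution=Attribution("abs_census_2021", "G02", date(2025, 1, 15)),
)
```

Making the attribution a required field is what enforces "no number appears without a source" at construction time.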

Acknowledgments

Gazetteer is a thin layer over years of work by other people. Every link below resolves through the Wayback Machine; the small live ↗ opens the provider's current page directly.

Names of individual contributors who have shaped specific design decisions land in this list as the project matures — if you've fed back on a Gazetteer page and you'd like to be named (or not named), say so via the support page.