Source file
City Data Sources
How we source and process geographic and demographic data for 190,000+ cities worldwide.
Methodology brief
What this file explains
This source file traces how city data moves from public release to published WorldStats pages. It covers collection, validation, transformation, coverage limits, and known caveats.
- 01 Source
- 02 Ingest
- 03 Validate
- 04 Publish
Overview
City data on WorldStats comes primarily from the GeoNames geographical database, which provides standardized city information including coordinates, population, elevation, timezone, and administrative region for millions of places worldwide.
Data Sources
The primary source is GeoNames' cities500 dataset, which includes populated places with populations above 500. Two further sources supplement it: REST Countries provides country-level metadata (capital status, currency, languages), and Open-Meteo provides city-level climate data, aggregated with the same methodology used for country climate data.
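As an illustration of the ingest step, the sketch below reads rows in the tab-separated layout that GeoNames documents for its city exports (19 columns, no header row). It is a minimal sketch, not the WorldStats pipeline itself: the `City` type, the `parse_cities` function, the choice of fields to keep, and the fallback from the `elevation` column to the `dem` column are all assumptions made for the example.

```python
import csv
from typing import Iterable, Iterator, NamedTuple, Optional

# Column positions in the GeoNames tab-separated export (19 columns, no header).
COL_ID, COL_NAME = 0, 1
COL_LAT, COL_LON = 4, 5
COL_COUNTRY, COL_ADMIN1 = 8, 10
COL_POPULATION, COL_ELEVATION, COL_DEM, COL_TIMEZONE = 14, 15, 16, 17
EXPECTED_COLUMNS = 19


class City(NamedTuple):
    """Hypothetical record type; the fields kept here are an assumption."""
    geoname_id: int
    name: str
    latitude: float
    longitude: float
    country_code: str
    admin1_code: str
    population: int
    elevation_m: Optional[int]
    timezone: str


def _optional_int(value: str) -> Optional[int]:
    """Return an int for a non-empty field, or None when the field is blank."""
    return int(value) if value.strip() else None


def parse_cities(lines: Iterable[str]) -> Iterator[City]:
    """Yield one City per well-formed row and skip rows with a wrong column count."""
    # QUOTE_NONE keeps quotation marks inside place names as literal characters.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != EXPECTED_COLUMNS:
            continue
        # Assumption: fall back to the digital elevation model value
        # when the surveyed elevation field is empty.
        elevation = _optional_int(row[COL_ELEVATION])
        if elevation is None:
            elevation = _optional_int(row[COL_DEM])
        yield City(
            geoname_id=int(row[COL_ID]),
            name=row[COL_NAME],
            latitude=float(row[COL_LAT]),
            longitude=float(row[COL_LON]),
            country_code=row[COL_COUNTRY],
            admin1_code=row[COL_ADMIN1],
            population=int(row[COL_POPULATION] or 0),
            elevation_m=elevation,
            timezone=row[COL_TIMEZONE],
        )
```

A file opened with `open(path, encoding="utf-8", newline="")` can be passed directly as `lines`, since the reader only needs an iterable of strings.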
Population Thresholds
WorldStats applies two population thresholds. Cities with populations above 100,000 receive full profiles with climate data. Populated places above 500 are included in the broader city database and search index. The profile threshold ensures richer city pages have enough supporting data, while the lower GeoNames threshold keeps smaller places discoverable.
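The two thresholds amount to a simple tiering rule, sketched below. The constant names, the `Tier` enum, and the `classify` function are hypothetical, and treating both thresholds as strict "greater than" comparisons is an assumption based on the word "above" in the text.

```python
from enum import Enum

# Thresholds taken from the text; the names are illustrative.
PROFILE_MIN_POPULATION = 100_000
INDEX_MIN_POPULATION = 500


class Tier(Enum):
    FULL_PROFILE = "full_profile"  # full city page with climate data
    SEARCH_ONLY = "search_only"    # broader city database and search index
    EXCLUDED = "excluded"          # at or below the lower threshold


def classify(population: int) -> Tier:
    """Map a population figure to a coverage tier (strict comparisons assumed)."""
    if population > PROFILE_MIN_POPULATION:
        return Tier.FULL_PROFILE
    if population > INDEX_MIN_POPULATION:
        return Tier.SEARCH_ONLY
    return Tier.EXCLUDED
```

Under these assumptions a place with exactly 100,000 residents falls in the search-only tier. If the real boundary is inclusive, only the comparison operators change.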
Known Limitations
City population figures are based on the most recent GeoNames data, which varies in recency by country. Some figures rest on census data that is 5–10 years old. Administrative boundaries and city definitions also vary by country, so population figures are not directly comparable, particularly when one figure describes a metropolitan area and another a municipality.