Source file
World Bank Data Processing
How we ingest and process World Development Indicators: 50+ indicators covering GDP, population, trade, education, and more, for 217 economies.
Methodology brief
What this file explains
This source file shows how World Bank data moves from public release to published WorldStats pages: collection, validation, transformation, coverage limits, and known caveats.
- 01 Source
- 02 Ingest
- 03 Validate
- 04 Publish
Overview
The World Bank's World Development Indicators (WDI) is the primary source for economic, demographic, and development data on WorldStats. WDI covers over 1,400 indicators for 217 economies, with time series going back to 1960 for many indicators.
Data Ingestion
Data is fetched via the World Bank Indicators API (v2) using each indicator's API code (e.g., SP.POP.TOTL for population). We request all available years and countries per indicator, then store the most recent available data point for each country-indicator pair. The ingestion process runs periodically and tracks which indicators have been updated.
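The fetch-and-reduce step described above could look roughly like the following. This is a minimal sketch, not the WorldStats implementation: the function names (`build_url`, `fetch_records`, `latest_per_country`) and the page size are illustrative assumptions. It relies on the v2 API returning a two-element JSON array (paging metadata, then rows) and on annual `date` values such as "2022".

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.worldbank.org/v2"


def build_url(indicator: str, page: int = 1, per_page: int = 1000) -> str:
    """Build a request URL for one indicator across all countries and years."""
    query = urlencode({"format": "json", "per_page": per_page, "page": page})
    return f"{BASE_URL}/country/all/indicator/{indicator}?{query}"


def fetch_records(indicator: str) -> list[dict]:
    """Fetch every page of rows for an indicator (hypothetical helper)."""
    records, page = [], 1
    while True:
        with urlopen(build_url(indicator, page), timeout=30) as resp:
            payload = json.load(resp)
        # A successful response is [metadata, rows]; stop on anything else.
        if len(payload) < 2 or payload[1] is None:
            break
        meta, rows = payload
        records.extend(rows)
        if page >= int(meta["pages"]):
            break
        page += 1
    return records


def latest_per_country(records: list[dict]) -> dict[str, tuple[int, float]]:
    """Keep the most recent non-null (year, value) for each country code."""
    latest: dict[str, tuple[int, float]] = {}
    for row in records:
        if row.get("value") is None:
            continue  # unreported years are skipped, never filled in
        code, year = row["country"]["id"], int(row["date"])  # assumes annual dates
        if code not in latest or year > latest[code][0]:
            latest[code] = (year, row["value"])
    return latest
```

For example, `latest_per_country(fetch_records("SP.POP.TOTL"))` would map each country code to its newest reported population figure. Reducing after the full fetch keeps the network code separate from the selection logic, so the selection can be tested offline.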
Processing and Validation
Each data point includes the country's ISO2 code, indicator ID, year, and numeric value. We validate that values fall within the expected range for each indicator type (e.g., percentages between 0 and 100, population greater than 0). Null or missing values are excluded rather than interpolated: we show only data that the World Bank has actually reported.
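As an illustration of these checks, the sketch below models a data point and applies the two example rules from the paragraph above. The `DataPoint` class, the `RANGE_RULES` table, and the indicator type labels are hypothetical names chosen for this example; the real rule set is not documented here.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DataPoint:
    country_iso2: str
    indicator_id: str
    year: int
    value: Optional[float]


# Hypothetical rule table: one range check per indicator type.
RANGE_RULES: dict[str, Callable[[float], bool]] = {
    "percentage": lambda v: 0 <= v <= 100,
    "population": lambda v: v > 0,
}


def is_valid(point: DataPoint, indicator_type: str) -> bool:
    """Reject missing values, then apply the range rule for the type, if any."""
    if point.value is None or math.isnan(point.value):
        return False
    rule = RANGE_RULES.get(indicator_type)
    return rule is None or rule(point.value)


def keep_valid(points: list[DataPoint], indicator_type: str) -> list[DataPoint]:
    """Drop invalid points instead of estimating replacements."""
    return [p for p in points if is_valid(p, indicator_type)]
```

Invalid and missing points are filtered out rather than repaired, which matches the rule that nothing is interpolated.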
Coverage
Coverage varies significantly by indicator. Population and GDP data are available for nearly all 217 economies, usually up to a recent year. Specialized indicators like military expenditure or research spending may cover only 100–150 countries and lag by 1–3 years. The latest available year is displayed on each indicator page.
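Coverage figures of this kind can be derived from the stored data points. The sketch below is one possible way to compute them, assuming the input is a collection of (indicator ID, country code, year) tuples; the function name and the output layout are illustrative, not the actual WorldStats schema.

```python
from collections import defaultdict

TOTAL_ECONOMIES = 217  # number of economies cited for WDI


def coverage_by_indicator(points):
    """Summarize how many economies report each indicator and the newest year seen.

    `points` is an iterable of (indicator_id, country_iso2, year) tuples.
    """
    countries = defaultdict(set)
    latest_year = {}
    for indicator_id, country, year in points:
        countries[indicator_id].add(country)
        latest_year[indicator_id] = max(year, latest_year.get(indicator_id, year))
    return {
        indicator_id: {
            "economies": len(codes),
            "share": len(codes) / TOTAL_ECONOMIES,
            "latest_year": latest_year[indicator_id],
        }
        for indicator_id, codes in countries.items()
    }
```

The `latest_year` field corresponds to the year shown on each indicator page, and `share` makes it easy to flag indicators that cover only a fraction of the 217 economies.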
Known Limitations
World Bank data may lag 1–2 years behind the current year, especially for developing economies. Some indicators rely on surveys conducted every 3–5 years rather than annually. Small territories and non-UN-member entities may have limited coverage. The World Bank occasionally revises historical data, which we incorporate on the next refresh cycle.
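One way a refresh cycle could pick up revised values is to compare freshly fetched values with what is already stored. The sketch below is an assumption about how that comparison might work, not a description of the actual refresh job: `find_revisions` is a hypothetical helper, and both inputs are assumed to be dictionaries keyed by (country code, indicator ID, year).

```python
import math


def find_revisions(stored, fetched, rel_tol=1e-9):
    """Return {key: (old_value, new_value)} for values that changed upstream.

    Keys present only in `fetched` are new data points, not revisions,
    so they are ignored here.
    """
    revisions = {}
    for key, new_value in fetched.items():
        old_value = stored.get(key)
        if old_value is None:
            continue
        # Compare with a tolerance so float round-trips are not flagged.
        if not math.isclose(old_value, new_value, rel_tol=rel_tol):
            revisions[key] = (old_value, new_value)
    return revisions
```

A refresh could then overwrite the stored values for the returned keys and mark the affected indicators as updated.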