Home
Here's how to create a visualization for your province or city. At a high level:
- Find the data source
- Convert the data into a usable format
- Create the visualization
- Verify the data
Visualizations are built based on revenue (what the government makes) and expenses (what the government spends). To make them engaging, we go as granular as possible – especially on the spending side – typically 2-4 levels deep.
We use actual data from the most recent fiscal year, not budgeted projections. That is, what the government actually spent, not what it planned to spend. Fiscal years run from April 1 to March 31, with a delay before the data is released. As of July 2025, the most recent actual data available is for 2023-2024.
The data is usually in the government's Public Accounts, which are public and searchable. Here's where we found the federal and Ontario data.
We're typically looking for two things:
- Granular spending data – Ideally, line-by-line expenses by ministry. For example, Ontario breaks down each ministry's spending here.
- Audited reports – These are reviewed by an independent third party and serve as the source of truth. Ontario's Annual Reports and Consolidated Financial Statements are a good example.
Note 1: Ontario's Annual Report summarizes figures by sector (e.g. health) whereas the line-by-line data is by ministry (e.g. Ministry of Health). These can't be directly compared. We had to dig in to find Schedule 4: Expense by Ministry from the Consolidated Financial Statements to reconcile the two.
Note 2: We used unaudited line-by-line expenses as the baseline for the visualization. We added an Unreported line item to every ministry to make the totals match the audited reports. See spreadsheet here. It's not clear why the numbers differ; presumably the expenses reported by each ministry are not audited.
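The reconciliation in Note 2 can be sketched as follows. This is a minimal illustration, not the script used for the Ontario data: the function name `add_unreported` and all figures are made up, and amounts are assumed to be in millions.

```python
def add_unreported(line_items, audited_total):
    """Return a copy of line_items with an 'Unreported' entry that closes
    the gap between the unaudited line items and the audited total."""
    reported = sum(line_items.values())
    gap = round(audited_total - reported, 2)
    result = dict(line_items)
    if gap != 0:
        # The gap can be negative if the line items exceed the audited total.
        result["Unreported"] = gap
    return result


# Hypothetical ministry with two programs (figures are invented).
ministry = {"Program A": 120.5, "Program B": 79.5}
reconciled = add_unreported(ministry, audited_total=210.0)
print(reconciled)
```

After this step, the sum of each ministry's line items equals the audited figure, so the diagram totals can be checked against the audited reports.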
If you're lucky, the data comes in CSV or JSON. If it's in a PDF, then you'll need to extract the data.
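For the CSV case, a conversion step might look like the sketch below. The column names (`ministry`, `program`, `amount`) and the sample rows are assumptions for illustration; real government files will use different headers and often need cleanup first.

```python
import csv
import io
import json
from collections import defaultdict

# Invented sample data; a real file would be opened with open(path, newline="").
SAMPLE_CSV = """ministry,program,amount
Health,Hospitals,100.0
Health,Drug Programs,25.5
Education,Schools,80.0
"""


def csv_to_nested(text):
    """Group flat CSV rows into {ministry: {program: amount}}."""
    nested = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        programs = nested[row["ministry"]]
        # Sum repeated (ministry, program) rows instead of overwriting them.
        programs[row["program"]] = programs.get(row["program"], 0.0) + float(row["amount"])
    return dict(nested)


spending = csv_to_nested(SAMPLE_CSV)
print(json.dumps(spending, indent=2))
```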
Create a script (or ask an LLM) to generate the Sankey diagram. Refer to federal or Ontario examples for structure.
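As a rough starting point, the sketch below turns revenue and nested spending data into the generic node and link lists that most Sankey libraries accept. It is not the format used by the federal or Ontario examples; the function name `build_sankey`, the root label, and the figures are all hypothetical.

```python
def build_sankey(revenue, spending, root="Government"):
    """Build node and link lists: revenue sources flow into the root,
    and the root flows out into (possibly nested) spending categories."""
    nodes = []   # node names, indexed by position
    index = {}   # name -> position in nodes
    links = []   # {"source": int, "target": int, "value": float}

    def node_id(name):
        # Assumes names are unique across levels; duplicates would share a node.
        if name not in index:
            index[name] = len(nodes)
            nodes.append(name)
        return index[name]

    def total(tree):
        return sum(total(v) if isinstance(v, dict) else v for v in tree.values())

    def add_spending(parent, tree):
        for name, value in tree.items():
            amount = total(value) if isinstance(value, dict) else value
            links.append({"source": node_id(parent), "target": node_id(name), "value": amount})
            if isinstance(value, dict):
                add_spending(name, value)

    node_id(root)
    for name, amount in revenue.items():
        links.append({"source": node_id(name), "target": node_id(root), "value": amount})
    add_spending(root, spending)
    return {"nodes": nodes, "links": links}


# Invented figures, two levels deep on the spending side.
revenue = {"Income Tax": 60.0, "Sales Tax": 40.0}
spending = {"Health": {"Hospitals": 50.0, "Drug Programs": 20.0}, "Education": 30.0}
sankey = build_sankey(revenue, spending)
print(sankey["nodes"])
```

Parent categories get the sum of their children as their link value, so each level of the diagram stays consistent with the level below it.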
- Are all figures actuals, not budgeted?
- Does Revenue = Spending + Deficit/Surplus?
- Do Revenue and Spending match the audited reports?
- Gut check: Do the groupings make sense? Are small amounts better grouped together?
- Gut check: Is the level of granularity reasonable? Should we collapse the number of levels?
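The numeric checks in this list can be automated; the gut checks cannot. The sketch below assumes a signed balance (positive for a surplus, negative for a deficit) so that Revenue = Spending + Balance holds in both cases. The function name `verify`, the tolerance, and the figures are illustrative only.

```python
def verify(revenue, spending, balance, audited_revenue, audited_spending, tolerance=0.01):
    """Return a list of problems found; an empty list means the checks pass.
    balance is positive for a surplus and negative for a deficit."""
    problems = []
    if abs(revenue - (spending + balance)) > tolerance:
        problems.append("Revenue does not equal Spending + Deficit/Surplus")
    if abs(revenue - audited_revenue) > tolerance:
        problems.append("Revenue does not match the audited report")
    if abs(spending - audited_spending) > tolerance:
        problems.append("Spending does not match the audited report")
    return problems


# Invented figures: a deficit of 10 on revenue of 100.
ok = verify(100.0, 110.0, -10.0, audited_revenue=100.0, audited_spending=110.0)
bad = verify(100.0, 110.0, -10.0, audited_revenue=100.0, audited_spending=112.0)
print(ok)
print(bad)
```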
The actual spending and revenue data for Toronto was available from two resources:
- Resource 1: https://www.toronto.ca/city-government/budget-finances/city-finance/annual-financial-report/
- Resource 2: https://open.toronto.ca/dataset/revenues-and-expenses/
The benefit of using both resources was the breakdowns they provided. Resource 1 had good breakdowns on the revenue side that were not available in Resource 2, whereas Resource 2 had good breakdowns on the expense side.
Using two resources had its own problems, though, because some of the numbers did not match. For example, Transportation expense was 4460 million in Resource 1 versus 4518.46 million in Resource 2. Total operational revenue was 18202 million in both resources, but the expense total in the final graph had to be adjusted to 16567.19 million, compared with 16186 million as reported in Resource 1.
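To make gaps like these visible, the two sets of figures can be compared side by side. The sketch below uses the Toronto numbers quoted above (in millions); the dictionary layout and labels are just one possible arrangement, and `Decimal` is used to avoid floating-point rounding noise.

```python
from decimal import Decimal

# (Resource 1 figure, comparison figure), in millions, as quoted above.
figures = {
    "Transportation expense (R1 vs R2)": (Decimal("4460"), Decimal("4518.46")),
    "Total expense (R1 vs final graph)": (Decimal("16186"), Decimal("16567.19")),
}

differences = {}
for label, (r1, other) in figures.items():
    diff = other - r1
    differences[label] = diff
    share = diff / r1 * 100  # gap as a percentage of the Resource 1 figure
    print(f"{label}: difference {diff} million ({share:.2f}% of R1)")
```

Listing the gaps this way helps decide which source to treat as the baseline for each side of the diagram.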