Methodology
How we source, process, and present data. Full transparency.
Source Data vs. Analyzed Data
We clearly distinguish between two types of content:
**Source Data** (black badge): Raw values from Statistics Canada, unmodified. When you see a number on the Releases page, it's exactly what StatCan published.
**Analyzed Data** (red badge): Computed metrics like year-over-year change, inflation rates from CPI indices, and AI-generated narratives. These involve calculations or interpretations.
Every page is labeled so you always know which you're looking at.
Inflation Rate
We calculate inflation as the year-over-year percentage change in the Consumer Price Index (CPI).
**Formula**: ((CPI_current - CPI_12months_ago) / CPI_12months_ago) × 100
**Source**: Statistics Canada Table 18-10-0004-01, CPI All-items
**Vector**: v41690973 (Canada)
**Frequency**: Monthly
The CPI is an index with base 2002=100. A value of 165 means prices are 65% higher than in 2002. We compute YoY change to get the familiar "inflation rate" percentage.
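As a rough illustration of the formula above, here is a minimal Python sketch. The function name `yoy_inflation` and the index values are assumptions made for this example; they are not published StatCan figures or our production code.

```python
def yoy_inflation(cpi_current: float, cpi_12_months_ago: float) -> float:
    """Year-over-year percentage change between two CPI index values."""
    if cpi_12_months_ago <= 0:
        raise ValueError("CPI index values must be positive")
    return (cpi_current - cpi_12_months_ago) / cpi_12_months_ago * 100


# Illustrative index values only (base 2002=100), not real data.
rate = yoy_inflation(cpi_current=165.0, cpi_12_months_ago=160.0)
print(f"{rate:.1f}%")  # prints 3.1%
```

Note that the result compares two index levels 12 months apart; it does not depend on the 2002 base year itself.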
Unemployment Rate
We display the seasonally adjusted unemployment rate as published by Statistics Canada's Labour Force Survey.
**Source**: Statistics Canada Table 14-10-0287-01
**Vector**: v2062815 (Canada, 15+, both sexes, seasonally adjusted)
**Frequency**: Monthly
No transformation is applied: this is the rate exactly as StatCan publishes it. Provincial rates come from the same survey, selected by a different value of the geography dimension.
GDP Growth
We calculate annualized quarter-over-quarter GDP growth.
**Formula**: ((GDP_current / GDP_previous)^4 - 1) × 100
**Source**: Statistics Canada Table 36-10-0104-01, expenditure-based GDP
**Vector**: v62305752
**Frequency**: Quarterly
Annual growth rates on our charts are computed from the annual average of quarterly GDP levels.
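The two GDP calculations can be sketched in Python as follows. The function names and sample levels are hypothetical, and the annual version assumes "annual growth" means the percentage change between the averages of two consecutive years' quarterly levels.

```python
from statistics import mean


def annualized_qoq_growth(gdp_current: float, gdp_previous: float) -> float:
    """Quarter-over-quarter growth compounded over four quarters, in percent."""
    return ((gdp_current / gdp_previous) ** 4 - 1) * 100


def annual_average_growth(this_year: list[float], last_year: list[float]) -> float:
    """Percent change between the average quarterly level of two years.

    Assumption: each list holds the four quarterly GDP levels of one year.
    """
    return (mean(this_year) / mean(last_year) - 1) * 100


# Illustrative levels only: a 1% quarterly gain annualizes to about 4.06%.
print(f"{annualized_qoq_growth(101.0, 100.0):.2f}%")  # prints 4.06%
```

Compounding is why a 1% quarterly gain annualizes to slightly more than 4%.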
Real Wage Growth
We compare wage growth to CPI growth to determine if purchasing power is increasing or decreasing.
**Formula**: (Wage YoY%) - (CPI YoY%) = Real wage growth
**Sources**:
- Wages: Table 14-10-0287-01, average hourly wages (v2062809)
- CPI: Table 18-10-0004-01, all-items (v41690973)
A positive number means wages grew faster than prices.
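A minimal Python sketch of this subtraction, using hypothetical function names and made-up wage and CPI values:

```python
def yoy_pct(current: float, year_ago: float) -> float:
    """Year-over-year percentage change."""
    return (current - year_ago) / year_ago * 100


def real_wage_growth(wage_now: float, wage_year_ago: float,
                     cpi_now: float, cpi_year_ago: float) -> float:
    """Wage YoY% minus CPI YoY%, in percentage points."""
    return yoy_pct(wage_now, wage_year_ago) - yoy_pct(cpi_now, cpi_year_ago)


# Illustrative values only: wages up 5%, prices up 3%.
growth = real_wage_growth(31.50, 30.00, 164.8, 160.0)
print(f"{growth:+.1f} percentage points")  # prints +2.0 percentage points
```

The simple difference is a close approximation of the exact ratio (1 + wage growth) / (1 + inflation) - 1 when both rates are small.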
AI-Generated Content
We use Mistral Small 4 (via OpenRouter) to generate:
- **Release headlines**: AI picks the most newsworthy figure from each StatCan release
- **Analytics narratives**: AI writes analysis based on actual data trends
- **Data stories**: AI generates multi-section narratives from StatCan vectors
All AI content is:
1. Generated from real StatCan data (never fabricated)
2. Labeled as "Analysis" or "Data Story"
3. Cross-checked by our verification script
4. Linked to source tables for fact-checking
We do NOT use AI to generate or modify source data values.
Data Pipeline
Our ETL pipeline runs weekdays at 9 AM ET:
1. **Extract**: Query StatCan's getChangedCubeList API for recent releases
2. **Enrich**: Fetch metadata and latest data points for each release
3. **AI Headlines**: Mistral Small 4 picks key figures and writes summaries
4. **Analytics**: Refresh 17 topic indicators from specific vectors
5. **Store**: All data stored in Supabase with full provenance
6. **Track**: Every run logged in pipeline_runs with step-level metrics
The pipeline status page (/pipeline) shows the full history.
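To give a sense of what step-level tracking can look like, here is a simplified Python sketch. It is not our production code: the `run_pipeline` function, the record fields, and the stub steps are all hypothetical, and the real pipeline writes its records to the pipeline_runs table rather than returning a dictionary.

```python
import time
from typing import Callable


def run_pipeline(steps: list[tuple[str, Callable[[], int]]]) -> dict:
    """Run steps in order, recording status, row count, and duration for each.

    Each step is a (name, function) pair; the function returns a row count.
    The run stops at the first step that raises an exception.
    """
    run = {"status": "success", "steps": []}
    for name, step in steps:
        started = time.monotonic()
        try:
            record = {"step": name, "status": "success", "rows": step()}
        except Exception as exc:
            record = {"step": name, "status": "failed", "error": str(exc)}
            run["status"] = "failed"
        record["seconds"] = round(time.monotonic() - started, 3)
        run["steps"].append(record)
        if record["status"] == "failed":
            break
    return run


# Stub steps standing in for the real extract and enrich stages.
result = run_pipeline([("extract", lambda: 12), ("enrich", lambda: 12)])
print(result["status"], [s["step"] for s in result["steps"]])
```

Recording a result per step, rather than per run, is what lets a status page show where a failed run stopped.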
Semantic Search
We generate text embeddings for all 8,148+ table titles using a transformer model and store them in Supabase with pgvector.
When you search, your query is embedded and compared against all table embeddings using cosine similarity. Results are ranked by relevance score.
This allows natural language queries like "average rent in Toronto" to find the right StatCan table even if the title doesn't contain those exact words.
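The ranking step can be illustrated with a small pure-Python sketch. In practice the comparison runs inside the database via pgvector; the three-dimensional vectors, table titles, and function names below are toy assumptions chosen only to show how cosine similarity orders results.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_tables(query_vec: list[float],
                table_vecs: dict[str, list[float]],
                top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (title, score) pairs, highest similarity first."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in table_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings; real ones have hundreds of dimensions.
tables = {
    "Rents by city": [0.9, 0.1, 0.0],
    "Labour force characteristics": [0.0, 0.2, 0.9],
    "Consumer Price Index": [0.5, 0.5, 0.1],
}
for title, score in rank_tables([1.0, 0.0, 0.0], tables):
    print(f"{score:.2f}  {title}")
```

Because cosine similarity measures direction rather than length, a short query and a long table title can still score as close matches.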
Questions about our methodology? Open an issue on our GitHub or contact us.