Taxonomy Architecture
Overview
The taxonomy system defines all financial surfaces, computed ratios, and KPIs used throughout the application. The Rust JSON files in rust/taxonomy/ serve as the single source of truth for all financial definitions.
Data Flow
┌─────────────────────────────────────────────────────────────────┐
│  rust/taxonomy/fiscal/v1/                                       │
│                                                                 │
│  core.surface.json        - Income/Balance/Cash Flow/Equity     │
│  core.computed.json       - Ratio definitions                   │
│  core.kpis.json           - Sector-specific KPIs                │
│  core.income-bridge.json  - Income statement mapping rules      │
│                                                                 │
│  *.surface.json           - Core plus industry-specific packs   │
│  *.income-bridge.json     - Pack-specific universal income maps │
│  kpis/*.kpis.json         - Pack KPI bundles                    │
└──────────────────────────┬──────────────────────────────────────┘
                           │
         ┌─────────────────┼─────────────────┐
         │                 │                 │
         ▼                 ▼                 ▼
  ┌─────────────┐  ┌────────────────┐ ┌───────────────┐
  │ Rust Sidecar│  │ TS Generator   │ │ TypeScript    │
  │ fiscal-xbrl │  │ scripts/       │ │ Runtime       │
  │             │  │ generate-      │ │               │
  │ Parses XBRL │  │ taxonomy.ts    │ │ UI/API        │
  │ Maps to     │  │                │ │               │
  │ surfaces    │  │ Generates TS   │ │ Uses generated│
  │ Computes    │  │ types & consts │ │ definitions   │
  │ ratios      │  │                │ │               │
  └──────┬──────┘  └───────┬────────┘ └──────┬────────┘
         │                 │                 │
         │                 ▼                 │
         │          ┌──────────────┐         │
         │          │ lib/generated│         │
         │          │ (gitignored) │         │
         │          │              │         │
         │          │ surfaces/    │         │
         │          │ computed/    │         │
         │          │ kpis/        │         │
         │          └──────┬───────┘         │
         │                 │                 │
         ▼                 ▼                 ▼
  ┌─────────────────────────────────────────────────────────────┐
  │  lib/financial-metrics.ts                                   │
  │                                                             │
  │  Thin wrapper that:                                         │
  │  - Re-exports generated types                               │
  │  - Provides UI-specific types (GraphableFinancialSurface)   │
  │  - Transforms surfaces to metric definitions                │
  └─────────────────────────────────────────────────────────────┘
File Structure
rust/taxonomy/fiscal/v1/
├── core.surface.json # Core financial surfaces
├── core.computed.json # Ratio definitions (32 ratios)
├── core.income-bridge.json # Income statement XBRL mapping
├── core.kpis.json # Core KPIs (mostly empty)
├── universal_income.surface.json
│
├── bank_lender.surface.json
├── insurance.surface.json
├── reit_real_estate.surface.json
├── broker_asset_manager.surface.json
├── agriculture.surface.json
├── contractors_construction.surface.json
├── contractors_federal_government.surface.json
├── development_stage.surface.json
├── entertainment_*.surface.json
├── extractive_mining.surface.json
├── mortgage_banking.surface.json
├── title_plant.surface.json
├── franchisors.surface.json
├── not_for_profit.surface.json
├── plan_defined_*.surface.json
├── plan_health_welfare.surface.json
├── real_estate_*.surface.json
├── software.surface.json
├── steamship.surface.json
└── kpis/
    └── *.kpis.json

lib/generated/ # Auto-generated, gitignored
├── index.ts
├── types.ts
├── surfaces/
│   ├── index.ts
│   ├── income.ts
│   ├── balance.ts
│   └── cash_flow.ts
├── computed/
│   ├── index.ts
│   └── core.ts
└── kpis/
    ├── index.ts
    └── *.ts
Surface Definitions
Surfaces represent canonical financial line items. Each surface maps XBRL concepts to a standardized key. Generated TypeScript statement catalogs are built from the deduped union of core plus unique non-core surfaces, with core definitions winning for shared universal keys.
{
"surface_key": "revenue",
"statement": "income",
"label": "Revenue",
"category": "surface",
"order": 10,
"unit": "currency",
"rollup_policy": "direct_or_formula",
"allowed_source_concepts": [
"us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax",
"us-gaap:SalesRevenueNet"
],
"formula_fallback": null
}
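The "deduped union, core wins" rule for the generated catalogs can be sketched roughly as follows. This is an illustration only, not the generator's actual code: the `SurfaceLite` shape, the `buildCatalog` name, the choice of `statement` plus `surface_key` as the dedupe key, and the `net_interest_income` sample surface are all assumptions.

```typescript
// Hypothetical minimal shape; real surface definitions carry more fields.
interface SurfaceLite {
  surface_key: string;
  statement: string;
  label: string;
}

// Build one catalog from core plus pack surfaces.
// Assumption: entries are deduped on statement + surface_key.
function buildCatalog(core: SurfaceLite[], packs: SurfaceLite[][]): SurfaceLite[] {
  const byKey = new Map<string, SurfaceLite>();
  const keyOf = (s: SurfaceLite): string => `${s.statement}:${s.surface_key}`;

  // Core goes in first so its definitions win for shared keys.
  for (const s of core) byKey.set(keyOf(s), s);

  // Pack surfaces are added only when the key is not already present.
  for (const pack of packs) {
    for (const s of pack) {
      if (!byKey.has(keyOf(s))) byKey.set(keyOf(s), s);
    }
  }
  return Array.from(byKey.values());
}

const coreSurfaces: SurfaceLite[] = [
  { surface_key: "revenue", statement: "income", label: "Revenue" },
];
// Hypothetical pack content, for illustration only.
const bankSurfaces: SurfaceLite[] = [
  { surface_key: "revenue", statement: "income", label: "Total Revenue (bank)" },
  { surface_key: "net_interest_income", statement: "income", label: "Net Interest Income" },
];
const catalog = buildCatalog(coreSurfaces, [bankSurfaces]);
```

In this sketch the pack's `revenue` entry is dropped in favour of the core one, while the pack-only surface is kept.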
Surface Fields
| Field | Type | Description |
|---|---|---|
| surface_key | string | Unique identifier (snake_case) |
| statement | enum | income, balance, cash_flow, equity, comprehensive_income, disclosure |
| label | string | Human-readable label |
| category | string | Grouping category |
| order | number | Display order |
| unit | enum | currency, percent, ratio, shares, count |
| rollup_policy | string | How to aggregate: direct_only, direct_or_formula, aggregate_children, formula_only |
| allowed_source_concepts | string[] | XBRL concepts that map to this surface |
| formula_fallback | object | Optional formula when no direct mapping |
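On the TypeScript side, the fields above could be modelled along these lines. This is a hand-written sketch based only on the table; the types emitted into lib/generated/types.ts may look different, and `SurfaceDefinition`, `isOneOf`, and `isSurfaceDefinition` are hypothetical names. The shape of `formula_fallback` is not described here, so it is left as `unknown`.

```typescript
const STATEMENTS = ["income", "balance", "cash_flow", "equity", "comprehensive_income", "disclosure"] as const;
const UNITS = ["currency", "percent", "ratio", "shares", "count"] as const;
const ROLLUP_POLICIES = ["direct_only", "direct_or_formula", "aggregate_children", "formula_only"] as const;

interface SurfaceDefinition {
  surface_key: string;
  statement: (typeof STATEMENTS)[number];
  label: string;
  category: string;
  order: number;
  unit: (typeof UNITS)[number];
  rollup_policy: (typeof ROLLUP_POLICIES)[number];
  allowed_source_concepts: string[];
  formula_fallback: unknown; // shape not documented here
}

function isOneOf(allowed: readonly string[], value: unknown): boolean {
  return typeof value === "string" && allowed.includes(value);
}

// Runtime check for untrusted JSON; formula_fallback is deliberately not inspected.
function isSurfaceDefinition(value: unknown): value is SurfaceDefinition {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const key = v["surface_key"];
  const concepts = v["allowed_source_concepts"];
  return (
    typeof key === "string" &&
    /^[a-z][a-z0-9_]*$/.test(key) && // snake_case
    isOneOf(STATEMENTS, v["statement"]) &&
    typeof v["label"] === "string" &&
    typeof v["category"] === "string" &&
    typeof v["order"] === "number" &&
    isOneOf(UNITS, v["unit"]) &&
    isOneOf(ROLLUP_POLICIES, v["rollup_policy"]) &&
    Array.isArray(concepts) &&
    concepts.every((c) => typeof c === "string")
  );
}

const revenueJson = {
  surface_key: "revenue",
  statement: "income",
  label: "Revenue",
  category: "surface",
  order: 10,
  unit: "currency",
  rollup_policy: "direct_or_formula",
  allowed_source_concepts: [
    "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax",
    "us-gaap:SalesRevenueNet",
  ],
  formula_fallback: null,
};
const revenueIsValid = isSurfaceDefinition(revenueJson);
const badUnitIsValid = isSurfaceDefinition({ ...revenueJson, unit: "dollars" });
```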
Computed Definitions
Computed definitions describe ratios and derived metrics. They are split into two phases:
Phase 1: Filing-Derived (Rust computes)
Ratios computable from filing data alone:
- Margins: gross_margin, operating_margin, ebitda_margin, net_margin, fcf_margin
- Returns: roa, roe, roic, roce
- Financial Health: debt_to_equity, net_debt_to_ebitda, cash_to_debt, current_ratio
- Per-Share: revenue_per_share, fcf_per_share, book_value_per_share
- Growth: revenue_yoy, net_income_yoy, eps_yoy, fcf_yoy, *_cagr
Phase 2: Market-Derived (TypeScript computes)
Ratios requiring external price data:
- Valuation: marketcap, enterprise_value, price_to_earnings, price_to_fcf, price_to_book, ev_to*
{
"key": "gross_margin",
"label": "Gross Margin",
"category": "margins",
"order": 10,
"unit": "percent",
"computation": {
"type": "ratio",
"numerator": "gross_profit",
"denominator": "revenue"
}
}
{
"key": "price_to_earnings",
"label": "Price to Earnings",
"category": "valuation",
"order": 270,
"unit": "ratio",
"computation": {
"type": "simple",
"formula": "price / diluted_eps"
},
"requires_external_data": ["price"]
}
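One way to read the two examples: a definition belongs to phase 2 when it lists external inputs. The sketch below assumes `requires_external_data` is the only discriminator between the phases, which the examples suggest but this document does not state outright; `ComputedLite` and `splitByPhase` are hypothetical names.

```typescript
// Hypothetical minimal shape of a computed definition.
interface ComputedLite {
  key: string;
  requires_external_data?: string[];
}

interface PhaseSplit {
  filingDerived: ComputedLite[]; // phase 1: computable from filing data alone
  marketDerived: ComputedLite[]; // phase 2: needs external data such as price
}

function splitByPhase(defs: ComputedLite[]): PhaseSplit {
  const filingDerived: ComputedLite[] = [];
  const marketDerived: ComputedLite[] = [];
  for (const def of defs) {
    const needsExternal = (def.requires_external_data ?? []).length > 0;
    if (needsExternal) {
      marketDerived.push(def);
    } else {
      filingDerived.push(def);
    }
  }
  return { filingDerived, marketDerived };
}

const phases = splitByPhase([
  { key: "gross_margin" },
  { key: "price_to_earnings", requires_external_data: ["price"] },
]);
```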
Computation Types
| Type | Fields | Description |
|---|---|---|
| ratio | numerator, denominator | Simple division |
| yoy_growth | source | Year-over-year percentage change |
| cagr | source, years | Compound annual growth rate |
| per_share | source, shares_key | Divide by share count |
| simple | formula | Custom formula expression |
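A rough evaluator for the four structured computation types might look like this. It is a sketch under stated assumptions, not the sidecar's implementation: values are assumed to arrive as one record per fiscal year (oldest first), results are returned as fractions rather than percentages, year-over-year growth divides by the absolute prior value, and missing or zero inputs yield `null`. The `simple` type is left out because the formula grammar is not described here. The `shares` key in the sample data is hypothetical.

```typescript
type Computation =
  | { type: "ratio"; numerator: string; denominator: string }
  | { type: "yoy_growth"; source: string }
  | { type: "cagr"; source: string; years: number }
  | { type: "per_share"; source: string; shares_key: string };

// Values keyed by surface key for a single fiscal year (assumed layout).
type YearValues = Record<string, number | undefined>;

function divide(a: number | undefined, b: number | undefined): number | null {
  if (a === undefined || b === undefined || b === 0) return null;
  return a / b;
}

// history is ordered oldest to newest; the last entry is the latest year.
function evaluate(c: Computation, history: YearValues[]): number | null {
  const latest = history[history.length - 1];
  if (latest === undefined) return null;
  switch (c.type) {
    case "ratio":
      return divide(latest[c.numerator], latest[c.denominator]);
    case "per_share":
      return divide(latest[c.source], latest[c.shares_key]);
    case "yoy_growth": {
      const prior = history[history.length - 2]?.[c.source];
      const current = latest[c.source];
      if (current === undefined || prior === undefined || prior === 0) return null;
      return (current - prior) / Math.abs(prior);
    }
    case "cagr": {
      if (c.years <= 0) return null;
      const start = history[history.length - 1 - c.years]?.[c.source];
      const end = latest[c.source];
      if (start === undefined || end === undefined || start <= 0 || end < 0) return null;
      return Math.pow(end / start, 1 / c.years) - 1;
    }
  }
}

const history: YearValues[] = [
  { revenue: 100 },
  { revenue: 110 },
  { revenue: 121, gross_profit: 60.5, shares: 10 },
];
const grossMargin = evaluate({ type: "ratio", numerator: "gross_profit", denominator: "revenue" }, history);
const revenueYoy = evaluate({ type: "yoy_growth", source: "revenue" }, history);
const revenueCagr = evaluate({ type: "cagr", source: "revenue", years: 2 }, history);
const revenuePerShare = evaluate({ type: "per_share", source: "revenue", shares_key: "shares" }, history);
```

With the sample data, gross margin comes out at 0.5 and both growth figures at roughly 0.1 (10%), up to floating-point rounding.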
Pack Inheritance
Non-core packs inherit balance and cash_flow surfaces from core:
// taxonomy_loader.rs
if !matches!(pack, FiscalPack::Core) {
// Inherit balance + cash_flow from core
// Override with pack-specific definitions
}
This ensures consistency across packs while allowing sector-specific income statements.
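The inheritance rule can be illustrated as follows. The real logic lives in taxonomy_loader.rs; this TypeScript sketch only mirrors the two comments in the snippet above, and `PackSurface`, `resolvePackSurfaces`, and the sample surfaces (`total_assets`, `net_interest_income`) are hypothetical.

```typescript
// Hypothetical minimal surface shape.
interface PackSurface {
  surface_key: string;
  statement: string;
  label: string;
}

const INHERITED_STATEMENTS: readonly string[] = ["balance", "cash_flow"];

function resolvePackSurfaces(core: PackSurface[], pack: PackSurface[]): PackSurface[] {
  const resolved = new Map<string, PackSurface>();
  const keyOf = (s: PackSurface): string => `${s.statement}:${s.surface_key}`;

  // Inherit balance + cash_flow from core.
  for (const s of core) {
    if (INHERITED_STATEMENTS.includes(s.statement)) resolved.set(keyOf(s), s);
  }
  // Override with pack-specific definitions (and add pack-only surfaces).
  for (const s of pack) resolved.set(keyOf(s), s);

  return Array.from(resolved.values());
}

const coreDefs: PackSurface[] = [
  { surface_key: "revenue", statement: "income", label: "Revenue" },
  { surface_key: "total_assets", statement: "balance", label: "Total Assets" },
];
const bankDefs: PackSurface[] = [
  { surface_key: "total_assets", statement: "balance", label: "Total Assets (bank)" },
  { surface_key: "net_interest_income", statement: "income", label: "Net Interest Income" },
];
const bankResolved = resolvePackSurfaces(coreDefs, bankDefs);
```

Here the core income surface is not inherited, the core balance surface is inherited and then overridden by the pack, and the pack's own income surface is added.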
Auto-classification remains conservative. Pack selection uses concept and role scoring, then falls back to core when the top match is weak or ambiguous.
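A conservative selector of this kind could be sketched as below. The scoring itself is out of scope; the sketch assumes each pack already has a numeric score, and the `minScore` and `minMargin` thresholds (and their default values) are invented for illustration, not taken from pack_selector.rs.

```typescript
interface PackScore {
  pack: string;
  score: number; // combined concept + role score, assumed to be in [0, 1]
}

// Returns the winning pack, or "core" when the top match is weak or ambiguous.
function selectPack(scores: PackScore[], minScore = 0.6, minMargin = 0.15): string {
  const ranked = [...scores].sort((a, b) => b.score - a.score);
  const top = ranked[0];
  if (top === undefined || top.score < minScore) return "core"; // weak match
  const runnerUp = ranked[1];
  if (runnerUp !== undefined && top.score - runnerUp.score < minMargin) {
    return "core"; // ambiguous: two packs score too closely
  }
  return top.pack;
}

const clearWinner = selectPack([
  { pack: "bank_lender", score: 0.9 },
  { pack: "insurance", score: 0.3 },
]);
const ambiguous = selectPack([
  { pack: "bank_lender", score: 0.9 },
  { pack: "insurance", score: 0.85 },
]);
const weak = selectPack([{ pack: "software", score: 0.4 }]);
```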
Issuer Overlay Automation
Issuer overlays now support a runtime, database-backed path in addition to checked-in JSON files. When a user explicitly submits a ticker, POST /api/tickers/ensure enqueues a filing sync. The sync task hydrates filings with the current overlay revision, generates additive issuer mappings from residual extension concepts, and immediately rehydrates recent filings when a new overlay revision is published.
Automation is intentionally conservative:
- it only extends existing canonical surfaces
- it does not synthesize new surfaces
- it does not auto-delete prior mappings
Runtime overlay merge order is:
- pack primary/disclosure
- core primary/disclosure
- static issuer overlay file
- runtime issuer overlay
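The additive nature of the merge can be sketched as follows. This is an interpretation, not the actual merge code: it assumes each layer is a map from surface key to XBRL concepts, that the pack and core layers may introduce surfaces, and that the two overlay layers may only append concepts to surfaces that already exist. `ConceptMap`, `mergeLayers`, and the `acme:` concepts are hypothetical.

```typescript
// surface_key -> XBRL concepts mapped to it
type ConceptMap = Record<string, string[]>;

function appendUnique(target: string[], concepts: string[]): void {
  for (const c of concepts) {
    if (!target.includes(c)) target.push(c);
  }
}

// baseLayers: [pack, core]; overlayLayers: [static issuer file, runtime issuer overlay]
function mergeLayers(baseLayers: ConceptMap[], overlayLayers: ConceptMap[]): ConceptMap {
  const merged: ConceptMap = {};
  // Base layers define the canonical surfaces.
  for (const layer of baseLayers) {
    for (const [surface, concepts] of Object.entries(layer)) {
      const target = merged[surface] ?? (merged[surface] = []);
      appendUnique(target, concepts);
    }
  }
  // Overlays only extend existing surfaces; nothing is synthesized or deleted.
  for (const layer of overlayLayers) {
    for (const [surface, concepts] of Object.entries(layer)) {
      const target = merged[surface];
      if (target === undefined) continue; // unknown surface: skipped
      appendUnique(target, concepts);
    }
  }
  return merged;
}

const merged = mergeLayers(
  [
    { revenue: ["us-gaap:SalesRevenueNet"] }, // pack
    { revenue: ["us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax", "us-gaap:SalesRevenueNet"] }, // core
  ],
  [
    { revenue: ["acme:SubscriptionRevenue"] }, // static issuer overlay (hypothetical)
    { revenue: ["acme:ServicesRevenue"], made_up_surface: ["acme:Other"] }, // runtime overlay (hypothetical)
  ],
);
```

How conflicts are resolved when two layers map the same concept to different surfaces is not covered by this sketch.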
No-role statement admission is taxonomy-aware:
- primary statement admission is allowed only when a concept matches a primary statement surface
- disclosure-only concepts are excluded from surfaced primary statements
- explicit overlap handling exists for shared balance/equity concepts such as StockholdersEquity and LiabilitiesAndStockholdersEquity
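The first two admission rules could be expressed roughly like this. It is a simplified sketch: it treats every statement other than `disclosure` as primary (an assumption), returns all primary statements a concept matches, and leaves out the explicit balance/equity overlap handling, whose details are not described here. `SurfaceRef`, `admitToPrimaryStatements`, and the sample surface keys are hypothetical.

```typescript
interface SurfaceRef {
  surface_key: string;
  statement: string;
  allowed_source_concepts: string[];
}

// For a concept with no presentation role, list the primary statements it may be admitted to.
function admitToPrimaryStatements(concept: string, surfaces: SurfaceRef[]): string[] {
  const statements = new Set<string>();
  for (const s of surfaces) {
    if (s.statement === "disclosure") continue; // disclosure-only matches never surface
    if (s.allowed_source_concepts.includes(concept)) statements.add(s.statement);
  }
  return Array.from(statements);
}

// Hypothetical surfaces for illustration.
const sampleSurfaces: SurfaceRef[] = [
  { surface_key: "total_equity", statement: "balance", allowed_source_concepts: ["us-gaap:StockholdersEquity"] },
  { surface_key: "ending_equity", statement: "equity", allowed_source_concepts: ["us-gaap:StockholdersEquity"] },
  { surface_key: "lease_cost", statement: "disclosure", allowed_source_concepts: ["us-gaap:OperatingLeaseCost"] },
];
const equityAdmission = admitToPrimaryStatements("us-gaap:StockholdersEquity", sampleSurfaces);
const disclosureAdmission = admitToPrimaryStatements("us-gaap:OperatingLeaseCost", sampleSurfaces);
```

A concept such as StockholdersEquity matches both a balance and an equity surface in this sample, which is the kind of overlap the real implementation handles explicitly.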
Build Pipeline
# Generate TypeScript from Rust JSON
bun run generate
# Build Rust sidecar (includes taxonomy)
bun run build:sidecar
# Full build (generates + compiles)
bun run build
package.json Scripts
| Script | Description |
|---|---|
| generate | Run taxonomy generator |
| build:sidecar | Build Rust binary |
| build | Generate + Next.js build |
| lint | Generate + TypeScript check |
Validation
The generator validates:
- No duplicate surface keys within the same statement
- All ratio numerators/denominators reference existing surfaces
- Required fields present on all definitions
- Valid statement/unit/category values
Run validation:
bun run generate # Validates during generation
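Two of these checks, duplicate keys within a statement and dangling ratio references, might be implemented along the following lines. This is a sketch, not the code in scripts/generate-taxonomy.ts; `SurfaceEntry`, `RatioEntry`, and `validateTaxonomy` are hypothetical names, and the error messages are invented.

```typescript
interface SurfaceEntry {
  surface_key: string;
  statement: string;
}

interface RatioEntry {
  key: string;
  computation: { type: string; numerator?: string; denominator?: string };
}

// Returns a list of human-readable problems; an empty list means both checks passed.
function validateTaxonomy(surfaces: SurfaceEntry[], ratios: RatioEntry[]): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  const knownKeys = new Set<string>();

  for (const s of surfaces) {
    const id = `${s.statement}:${s.surface_key}`;
    if (seen.has(id)) {
      errors.push(`duplicate surface "${s.surface_key}" in statement "${s.statement}"`);
    }
    seen.add(id);
    knownKeys.add(s.surface_key);
  }

  for (const r of ratios) {
    if (r.computation.type !== "ratio") continue; // only ratio operands are checked here
    for (const ref of [r.computation.numerator, r.computation.denominator]) {
      if (ref === undefined || !knownKeys.has(ref)) {
        errors.push(`ratio "${r.key}" references unknown surface "${ref ?? "(missing)"}"`);
      }
    }
  }
  return errors;
}

const problems = validateTaxonomy(
  [
    { surface_key: "revenue", statement: "income" },
    { surface_key: "revenue", statement: "income" }, // duplicate on purpose
    { surface_key: "gross_profit", statement: "income" },
  ],
  [
    { key: "gross_margin", computation: { type: "ratio", numerator: "gross_profit", denominator: "revenue" } },
    { key: "bad_margin", computation: { type: "ratio", numerator: "gross_profit", denominator: "sales" } },
  ],
);
```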
Extending the Taxonomy
Adding a New Surface
- Edit rust/taxonomy/fiscal/v1/core.surface.json
- Add surface definition with unique key
- Run bun run generate to regenerate TypeScript
- Run bun run build:sidecar to rebuild Rust
Adding a New Ratio
- Edit rust/taxonomy/fiscal/v1/core.computed.json
- Add computed definition with computation spec
- If market-derived, add requires_external_data
- Run bun run generate
Adding a New Sector Pack
- Create rust/taxonomy/fiscal/v1/<pack>.surface.json
- Create rust/taxonomy/fiscal/v1/<pack>.income-bridge.json
- Create rust/taxonomy/fiscal/v1/kpis/<pack>.kpis.json (if needed)
- Add pack to PACK_ORDER in scripts/generate-taxonomy.ts
- Add pack to FiscalPack enum in rust/fiscal-xbrl-core/src/pack_selector.rs
- Run bun run generate && bun run build:sidecar
Design Decisions
Why Rust JSON as Source of Truth?
- Single definition: XBRL mapping and TypeScript use the same definitions
- Type safety: Rust validates JSON at compile time
- Performance: No runtime JSON parsing in TypeScript
- Consistency: Impossible for Rust and TypeScript to drift
Why Gitignore Generated Files?
- Single source of truth: Forces changes through Rust JSON
- No merge conflicts: Generated code never conflicts
- Smaller repo: No large generated files in history
- CI validation: CI regenerates and validates
Why Two-Phase Ratio Computation?
- Filing-derived ratios: Can be computed at parse time by Rust
- Market-derived ratios: Require real-time price data
- Separation of concerns: Rust handles XBRL, TypeScript handles market data
- Same definitions: Both phases use the same computation specs