Balance Sheet Parser Spec

Purpose

This document defines the backend-only balance-sheet parsing rules for fiscal-xbrl-core.

This pass is limited to Rust parser behavior and taxonomy packs. It must not modify frontend files, frontend rendering logic, or frontend response shapes.

Hydration Order

  1. Load the selected surface pack.
  2. For non-core packs, merge in any core balance-sheet surfaces that the selected pack does not override.
  3. Resolve direct canonical balance rows from statement rows.
  4. Resolve aggregate-child rows from detail components when direct canonical rows are absent.
  5. Resolve formula-backed balance rows from already-resolved canonical rows.
  6. Emit unmapped only for rows not consumed by canonical balance parsing.
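
Step 2 can be sketched as follows. This is a minimal illustration, not the crate's actual code: `SurfaceDef`, its fields, and `merge_core_balance_surfaces` are hypothetical names, and "override" is assumed to mean "defines a balance surface with the same key".

```rust
use std::collections::HashSet;

// Hypothetical, simplified surface definition; the real pack schema may differ.
#[derive(Debug, Clone, PartialEq)]
struct SurfaceDef {
    surface_key: String,
    statement: String,
}

/// Appends core balance surfaces that the selected pack does not define.
/// Surfaces already present in the selected pack always win.
fn merge_core_balance_surfaces(selected: &[SurfaceDef], core: &[SurfaceDef]) -> Vec<SurfaceDef> {
    // Assumption: a surface is "overridden" when the selected pack has a
    // balance surface with the same key.
    let overridden: HashSet<&str> = selected
        .iter()
        .filter(|s| s.statement == "balance")
        .map(|s| s.surface_key.as_str())
        .collect();

    let mut merged = selected.to_vec();
    merged.extend(
        core.iter()
            .filter(|s| s.statement == "balance" && !overridden.contains(s.surface_key.as_str()))
            .cloned(),
    );
    merged
}
```

Keeping the selected pack's entries first means a sector pack never has to restate core coverage, only the surfaces it changes.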

Category Taxonomy

Balance rows use these backend category keys:

  • current_assets
  • noncurrent_assets
  • current_liabilities
  • noncurrent_liabilities
  • equity
  • derived
  • sector_specific

Default rule:

  • Place each row by its economic meaning first.
  • Reserve sector_specific for rows that no economic category can express.

Canonical Precedence Rule

Canonical balance mappings take precedence over residual classification.

If a statement row is consumed by a canonical balance row, it must not remain in detail_rows["balance"]["unmapped"].

Alias Flattening Rule

Synonymous balance concepts flatten into one canonical surface row.

Example:

  • AccountsReceivableNetCurrent
  • ReceivablesNetCurrent

These must become one accounts_receivable row with period-aware provenance.

Per-Period Resolution Rule

Direct balance matching is resolved per period, not by choosing one row globally.

For each canonical balance row:

  1. Collect all direct candidates.
  2. For each period, choose the best candidate with a value in that period.
  3. Build one canonical row from those period-specific winners.
  4. Preserve the union of all consumed aliases in source_concepts, source_row_keys, and source_fact_ids.
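
The four steps above can be sketched as below. The types and the `rank` field are hypothetical: this spec does not define how the "best" candidate is ranked, so the sketch assumes a precomputed rank where lower is better. It records provenance for period winners only and omits source_fact_ids for brevity.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical shape: one statement row that maps directly to the canonical key.
#[derive(Debug, Clone)]
struct Candidate {
    row_key: String,
    concept: String,
    rank: u32, // assumption: lower is better
    values: BTreeMap<String, Option<f64>>, // period id -> value
}

#[derive(Debug, Default)]
struct CanonicalRow {
    values: BTreeMap<String, Option<f64>>,
    source_concepts: BTreeSet<String>,
    source_row_keys: BTreeSet<String>,
}

/// Builds one canonical row by picking a winner separately for each period.
fn resolve_per_period(candidates: &[Candidate], periods: &[&str]) -> CanonicalRow {
    let mut row = CanonicalRow::default();
    for &period in periods {
        // Only candidates with an actual value in this period can win it.
        let winner = candidates
            .iter()
            .filter(|c| matches!(c.values.get(period), Some(Some(_))))
            .min_by_key(|c| c.rank);
        match winner {
            Some(c) => {
                row.values.insert(period.to_string(), c.values[period]);
                // Union provenance across all period winners.
                row.source_concepts.insert(c.concept.clone());
                row.source_row_keys.insert(c.row_key.clone());
            }
            None => {
                row.values.insert(period.to_string(), None);
            }
        }
    }
    row
}
```

With AccountsReceivableNetCurrent reported only in one period and ReceivablesNetCurrent only in another, this yields a single row carrying both values and both concepts in its provenance.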

Formula Evaluation Rule

Structured formulas are evaluated only after their source surface rows have been resolved.

Supported operators:

  • sum
  • subtract

Formula rules:

  • formulas operate period by period
  • sum may treat nulls as zero when treat_null_as_zero is true
  • subtract requires exactly two sources
  • formula rows inherit provenance from the source surface rows they consume
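
A minimal sketch of period-by-period evaluation under these rules follows. `Formula`, `PeriodValues`, and `evaluate` are hypothetical names, and provenance inheritance is left out. Two behaviors are assumptions rather than spec: a subtract with a null operand yields null, and a sum with treat_null_as_zero set to false yields null if any source is null.

```rust
use std::collections::BTreeMap;

type PeriodValues = BTreeMap<String, Option<f64>>; // period id -> value

#[derive(Debug)]
enum Formula {
    Sum { sources: Vec<String>, treat_null_as_zero: bool },
    Subtract { sources: Vec<String> },
}

// Looks up one already-resolved surface value; missing rows and periods are null.
fn value_of(resolved: &BTreeMap<String, PeriodValues>, source: &str, period: &str) -> Option<f64> {
    resolved.get(source).and_then(|v| v.get(period).copied().flatten())
}

fn evaluate(
    formula: &Formula,
    resolved: &BTreeMap<String, PeriodValues>,
    periods: &[&str],
) -> Result<PeriodValues, String> {
    // subtract requires exactly two sources.
    if let Formula::Subtract { sources } = formula {
        if sources.len() != 2 {
            return Err(format!("subtract needs 2 sources, got {}", sources.len()));
        }
    }
    let mut out = PeriodValues::new();
    for &period in periods {
        let value = match formula {
            Formula::Sum { sources, treat_null_as_zero } => {
                let vals = sources.iter().map(|s| value_of(resolved, s, period));
                if *treat_null_as_zero {
                    Some(vals.map(|v| v.unwrap_or(0.0)).sum::<f64>())
                } else {
                    // Assumption: any null source makes the period null.
                    vals.sum::<Option<f64>>()
                }
            }
            Formula::Subtract { sources } => {
                match (value_of(resolved, &sources[0], period), value_of(resolved, &sources[1], period)) {
                    (Some(a), Some(b)) => Some(a - b),
                    _ => None, // assumption: null operand yields null
                }
            }
        };
        out.insert(period.to_string(), value);
    }
    Ok(out)
}
```

Because `evaluate` reads only from `resolved`, it must run after direct and aggregate-child resolution have populated that map.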

Residual Pruning Rule

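balance.unmapped (that is, detail_rows["balance"]["unmapped"]) is a strict remainder set: it holds only rows that no canonical balance row consumed.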

A balance statement row must be excluded from unmapped when either of these is true:

  • its row key was consumed by a canonical balance row
  • its concept key was consumed by a canonical balance row
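
The pruning condition can be sketched as a simple filter. `StatementRow` and `prune_unmapped` are hypothetical names; the sketch assumes the consumed row keys and concept keys have already been collected during canonical resolution.

```rust
use std::collections::HashSet;

// Hypothetical, simplified statement row.
#[derive(Debug, Clone, PartialEq)]
struct StatementRow {
    row_key: String,
    concept_key: String,
}

/// Returns the rows that may stay in unmapped: those whose row key and
/// concept key were both left unconsumed by canonical balance rows.
fn prune_unmapped(
    rows: &[StatementRow],
    consumed_row_keys: &HashSet<String>,
    consumed_concept_keys: &HashSet<String>,
) -> Vec<StatementRow> {
    rows.iter()
        .filter(|r| {
            !consumed_row_keys.contains(&r.row_key)
                && !consumed_concept_keys.contains(&r.concept_key)
        })
        .cloned()
        .collect()
}
```

Checking the concept key as well as the row key is what removes a second row that reports an already-consumed concept under a different row key.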

Helper Surface Rule

Some balance rows are parser helpers rather than user-facing canonical output.

Current helper rows:

  • deferred_revenue_current
  • deferred_revenue_noncurrent
  • current_liabilities
  • leases

Behavior:

  • they remain available to formulas
  • they do not appear in emitted surface_rows
  • they do not create emitted detail buckets
  • they still consume matched backend sources so those rows do not leak into unmapped
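
One way to keep these two concerns separate is sketched below. `ResolvedSurface` and `emit_surfaces` are hypothetical names. The helper list is copied from this section; storing it as a constant is an assumption, since the real parser may carry a flag in the pack instead.

```rust
use std::collections::HashSet;

// Helper rows listed in this section.
const HELPER_SURFACES: [&str; 4] = [
    "deferred_revenue_current",
    "deferred_revenue_noncurrent",
    "current_liabilities",
    "leases",
];

// Hypothetical, simplified resolved surface row.
#[derive(Debug)]
struct ResolvedSurface {
    key: String,
    source_row_keys: Vec<String>,
}

/// Splits resolved surfaces into the rows to emit and the set of consumed
/// source row keys. Helpers are hidden from output but still count as consumers.
fn emit_surfaces(resolved: &[ResolvedSurface]) -> (Vec<&ResolvedSurface>, HashSet<&str>) {
    // Consumption is computed over every resolved row, helpers included.
    let consumed: HashSet<&str> = resolved
        .iter()
        .flat_map(|s| s.source_row_keys.iter().map(String::as_str))
        .collect();
    // Emission skips helper rows.
    let emitted: Vec<&ResolvedSurface> = resolved
        .iter()
        .filter(|s| !HELPER_SURFACES.contains(&s.key.as_str()))
        .collect();
    (emitted, consumed)
}
```

Computing `consumed` before filtering is the important ordering: if helpers were dropped first, their sources would leak into unmapped.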

Synonym vs Aggregate Child Rule

Two cases must remain distinct.

Synonym aliases

Synonym aliases are different concept names that carry the same canonical balance meaning.

Behavior:

  • flatten into one canonical surface row
  • do not emit duplicate detail rows
  • do not remain in unmapped

Aggregate child components

Aggregate child components are rows that legitimately roll up into a subtotal or total.

Behavior:

  • may remain as detail rows beneath the canonical parent when grouping is enabled
  • must not remain in unmapped after being consumed

Sector Placement Decisions

Sector-specific rows are placed in the same economic categories as core rows.

Mappings in this pass:

  • loans -> noncurrent_assets
  • allowance_for_credit_losses -> noncurrent_assets
  • deposits -> current_liabilities
  • policy_liabilities -> noncurrent_liabilities
  • deferred_acquisition_costs -> noncurrent_assets
  • investment_property -> noncurrent_assets

sector_specific remains unused by default in this pass.
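
The category keys and the mappings above could be encoded as in the sketch below. `BalanceCategory` and `sector_placement` are hypothetical names; only the key strings and the six mappings come from this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BalanceCategory {
    CurrentAssets,
    NoncurrentAssets,
    CurrentLiabilities,
    NoncurrentLiabilities,
    Equity,
    Derived,
    SectorSpecific,
}

impl BalanceCategory {
    /// Backend category key as listed under Category Taxonomy.
    fn as_key(self) -> &'static str {
        match self {
            BalanceCategory::CurrentAssets => "current_assets",
            BalanceCategory::NoncurrentAssets => "noncurrent_assets",
            BalanceCategory::CurrentLiabilities => "current_liabilities",
            BalanceCategory::NoncurrentLiabilities => "noncurrent_liabilities",
            BalanceCategory::Equity => "equity",
            BalanceCategory::Derived => "derived",
            BalanceCategory::SectorSpecific => "sector_specific",
        }
    }
}

/// Economic placement for the sector surfaces mapped in this pass.
/// Returns None for keys this section does not cover.
fn sector_placement(surface_key: &str) -> Option<BalanceCategory> {
    match surface_key {
        "loans"
        | "allowance_for_credit_losses"
        | "deferred_acquisition_costs"
        | "investment_property" => Some(BalanceCategory::NoncurrentAssets),
        "deposits" => Some(BalanceCategory::CurrentLiabilities),
        "policy_liabilities" => Some(BalanceCategory::NoncurrentLiabilities),
        _ => None,
    }
}
```

No arm returns `SectorSpecific`, which mirrors the statement that the category stays unused by default in this pass.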

Required Invariants

  • A consumed balance source must never remain in balance.unmapped.
  • A synonym alias must never create more than one canonical balance row.
  • Hidden helper surfaces may consume sources but must not appear in emitted surface_rows.
  • Formula-derived rows inherit canonical provenance from their source surfaces.
  • The frontend response shape remains unchanged.

Test Matrix

The parser must cover:

  • direct alias flattening for accounts_receivable
  • period-sparse alias merges into one canonical row
  • formula derivation for total_cash_and_equivalents
  • formula derivation for unearned_revenue
  • formula derivation for total_debt
  • formula derivation for net_cash_position
  • helper rows staying out of emitted balance surfaces
  • residual pruning of canonically consumed balance rows
  • sector packs receiving merged core balance coverage without changing frontend contracts

Learnings Reusable For Other Statements

The same parser rules should later apply to cash flow:

  • canonical mapping outranks residual classification
  • direct aliases should resolve per period
  • helper rows can exist backend-only when formulas need them
  • consumed sources must be removed from unmapped
  • sector packs should inherit common canonical coverage rather than duplicating it