Expand backend financial statement parsers

2026-03-12 21:15:54 -04:00
parent 33ce48f53c
commit 7a7a78340f
13 changed files with 4398 additions and 456 deletions


@@ -0,0 +1,144 @@
# Balance Sheet Parser Spec
## Purpose
This document defines the backend-only balance-sheet parsing rules for `fiscal-xbrl-core`.
This pass is limited to Rust parser behavior and taxonomy packs. It must not modify frontend files, frontend rendering logic, or frontend response shapes.
## Hydration Order
1. Load the selected surface pack.
2. For non-core packs, merge in any core balance-sheet surfaces that the selected pack does not override.
3. Resolve direct canonical balance rows from statement rows.
4. Resolve aggregate-child rows from detail components when direct canonical rows are absent.
5. Resolve formula-backed balance rows from already-resolved canonical rows.
6. Emit `unmapped` only for rows not consumed by canonical balance parsing.
## Category Taxonomy
Balance rows use these backend category keys:
- `current_assets`
- `noncurrent_assets`
- `current_liabilities`
- `noncurrent_liabilities`
- `equity`
- `derived`
- `sector_specific`
Default rule:
- use economic placement first
- reserve `sector_specific` for rows that cannot be expressed economically
## Canonical Precedence Rule
Canonical balance mappings take precedence over residual classification.
If a statement row is consumed by a canonical balance row, it must not remain in `detail_rows["balance"]["unmapped"]`.
## Alias Flattening Rule
Synonymous balance concepts flatten into one canonical surface row.
Example:
- `AccountsReceivableNetCurrent`
- `ReceivablesNetCurrent`
These must become one `accounts_receivable` row with period-aware provenance.
## Per-Period Resolution Rule
Direct balance matching is resolved per period, not by choosing one row globally.
For each canonical balance row:
1. Collect all direct candidates.
2. For each period, choose the best candidate with a value in that period.
3. Build one canonical row from those period-specific winners.
4. Preserve the union of all consumed aliases in `source_concepts`, `source_row_keys`, and `source_fact_ids`.
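The four steps above can be sketched as follows. The shapes here (`Candidate`, a numeric `rank`, string period keys) are simplified stand-ins for the real parser types, and "best candidate" is reduced to lowest rank; the actual ranking rules live elsewhere in the parser:

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical simplified candidate shape for this sketch.
#[derive(Clone)]
struct Candidate {
    row_key: String,
    concept: String,
    rank: u32,                     // lower is better, standing in for the real ranking rules
    values: BTreeMap<String, f64>, // period -> value
}

#[derive(Default)]
struct CanonicalRow {
    values: BTreeMap<String, f64>,
    source_concepts: BTreeSet<String>,
    source_row_keys: BTreeSet<String>,
}

// Pick a winner per period, build one canonical row, and keep the union of
// the provenance of every consumed alias.
fn resolve_per_period(periods: &[&str], candidates: &[Candidate]) -> CanonicalRow {
    let mut row = CanonicalRow::default();
    for period in periods {
        let winner = candidates
            .iter()
            .filter(|candidate| candidate.values.contains_key(*period))
            .min_by_key(|candidate| candidate.rank);
        if let Some(winner) = winner {
            row.values.insert(period.to_string(), winner.values[*period]);
            row.source_concepts.insert(winner.concept.clone());
            row.source_row_keys.insert(winner.row_key.clone());
        }
    }
    row
}

fn main() {
    // Period-sparse aliases: each concept carries a value for only one period.
    let ar_net = Candidate {
        row_key: "row-12".into(),
        concept: "AccountsReceivableNetCurrent".into(),
        rank: 0,
        values: BTreeMap::from([("FY2024".to_string(), 410.0)]),
    };
    let receivables = Candidate {
        row_key: "row-31".into(),
        concept: "ReceivablesNetCurrent".into(),
        rank: 1,
        values: BTreeMap::from([("FY2023".to_string(), 388.0)]),
    };
    let row = resolve_per_period(&["FY2023", "FY2024"], &[ar_net, receivables]);
    assert_eq!(row.values.len(), 2);          // one row covering both periods
    assert_eq!(row.source_concepts.len(), 2); // provenance is the union of consumed aliases
}
```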
## Formula Evaluation Rule
Structured formulas are evaluated only after their source surface rows have been resolved.
Supported operators:
- `sum`
- `subtract`
Formula rules:
- formulas operate period by period
- `sum` may treat nulls as zero when `treat_null_as_zero` is true
- `subtract` requires exactly two sources
- formula rows inherit provenance from the source surface rows they consume
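A period-by-period evaluator for the two operators might look like the sketch below. The `Formula` enum is a hypothetical shape for illustration; the real definitions live in the taxonomy packs, and provenance inheritance is omitted here:

```rust
use std::collections::BTreeMap;

type Values = BTreeMap<String, f64>; // period -> value

// Hypothetical formula shape for this sketch.
enum Formula {
    Sum { sources: Vec<String>, treat_null_as_zero: bool },
    Subtract { left: String, right: String }, // exactly two sources
}

// Formulas run only after their source rows are resolved, period by period.
fn evaluate(formula: &Formula, resolved: &BTreeMap<String, Values>, periods: &[&str]) -> Values {
    let mut out = Values::new();
    for period in periods {
        let result = match formula {
            Formula::Sum { sources, treat_null_as_zero } => {
                let mut total = 0.0;
                let mut seen = 0usize;
                for source in sources {
                    if let Some(value) = resolved.get(source).and_then(|v| v.get(*period)) {
                        total += value;
                        seen += 1;
                    }
                }
                // With treat_null_as_zero, missing sources count as zero;
                // otherwise every source must have a value for this period.
                if seen == sources.len() || (*treat_null_as_zero && seen > 0) {
                    Some(total)
                } else {
                    None
                }
            }
            Formula::Subtract { left, right } => match (
                resolved.get(left).and_then(|v| v.get(*period)),
                resolved.get(right).and_then(|v| v.get(*period)),
            ) {
                (Some(l), Some(r)) => Some(l - r),
                _ => None,
            },
        };
        if let Some(value) = result {
            out.insert(period.to_string(), value);
        }
    }
    out
}

fn main() {
    let mut resolved = BTreeMap::new();
    resolved.insert("cash".to_string(), Values::from([("FY2024".to_string(), 100.0)]));
    resolved.insert("short_term_investments".to_string(), Values::from([("FY2024".to_string(), 40.0)]));
    let total = Formula::Sum {
        sources: vec!["cash".into(), "short_term_investments".into()],
        treat_null_as_zero: false,
    };
    assert_eq!(evaluate(&total, &resolved, &["FY2024"])["FY2024"], 140.0);
}
```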
## Residual Pruning Rule
`balance.unmapped` is a strict remainder set.
A balance statement row must be excluded from `unmapped` when either of these is true:
- its row key was consumed by a canonical balance row
- its concept key was consumed by a canonical balance row
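The strict-remainder rule reduces to a membership filter. This sketch assumes simplified residual-row and consumed-set shapes; `HashSet` lookups keep the pass linear in the number of residual rows:

```rust
use std::collections::HashSet;

// Simplified residual row for this sketch.
struct ResidualRow {
    row_key: String,
    concept: String,
}

// A residual row is dropped when its row key OR its concept key was consumed
// by a canonical balance row; everything else survives as `unmapped`.
fn prune_unmapped(
    unmapped: Vec<ResidualRow>,
    consumed_row_keys: &HashSet<String>,
    consumed_concepts: &HashSet<String>,
) -> Vec<ResidualRow> {
    unmapped
        .into_iter()
        .filter(|row| {
            !consumed_row_keys.contains(&row.row_key)
                && !consumed_concepts.contains(&row.concept)
        })
        .collect()
}

fn main() {
    let consumed_rows: HashSet<String> = ["row-7".to_string()].into_iter().collect();
    let consumed_concepts: HashSet<String> = HashSet::new();
    let residuals = vec![
        ResidualRow { row_key: "row-7".into(), concept: "ReceivablesNetCurrent".into() },
        ResidualRow { row_key: "row-9".into(), concept: "OtherAssetsMiscellaneous".into() },
    ];
    let remaining = prune_unmapped(residuals, &consumed_rows, &consumed_concepts);
    assert_eq!(remaining.len(), 1); // only the truly unrelated row remains
}
```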
## Helper Surface Rule
Some balance rows are parser helpers rather than user-facing canonical output.
Current helper rows:
- `deferred_revenue_current`
- `deferred_revenue_noncurrent`
- `current_liabilities`
- `leases`
Behavior:
- they remain available to formulas
- they do not appear in emitted `surface_rows`
- they do not create emitted detail buckets
- they still consume matched backend sources so those rows do not leak into `unmapped`
## Synonym vs Aggregate Child Rule
Two cases must remain distinct.
### Synonym aliases
Different concept names for the same canonical balance meaning.
Behavior:
- flatten into one canonical surface row
- do not emit duplicate detail rows
- do not remain in `unmapped`
### Aggregate child components
Rows that legitimately roll into a subtotal or total.
Behavior:
- may remain as detail rows beneath the canonical parent when grouping is enabled
- must not remain in `unmapped` after being consumed
## Sector Placement Decisions
Sector rows stay inside the same economic taxonomy.
Mappings in this pass:
- `loans` -> `noncurrent_assets`
- `allowance_for_credit_losses` -> `noncurrent_assets`
- `deposits` -> `current_liabilities`
- `policy_liabilities` -> `noncurrent_liabilities`
- `deferred_acquisition_costs` -> `noncurrent_assets`
- `investment_property` -> `noncurrent_assets`
`sector_specific` remains unused by default in this pass.
## Required Invariants
- A consumed balance source must never remain in `balance.unmapped`.
- A synonym alias must never create more than one canonical balance row.
- Hidden helper surfaces may consume sources but must not appear in emitted `surface_rows`.
- Formula-derived rows inherit canonical provenance from their source surfaces.
- The frontend response shape remains unchanged.
## Test Matrix
The parser must cover:
- direct alias flattening for `accounts_receivable`
- period-sparse alias merges into one canonical row
- formula derivation for `total_cash_and_equivalents`
- formula derivation for `unearned_revenue`
- formula derivation for `total_debt`
- formula derivation for `net_cash_position`
- helper rows staying out of emitted balance surfaces
- residual pruning of canonically consumed balance rows
- sector packs receiving merged core balance coverage without changing frontend contracts
## Learnings Reusable For Other Statements
The same parser rules should later apply to cash flow:
- canonical mapping outranks residual classification
- direct aliases should resolve per period
- helper rows can exist backend-only when formulas need them
- consumed sources must be removed from `unmapped`
- sector packs should inherit common canonical coverage rather than duplicating it


@@ -0,0 +1,155 @@
# Cash Flow Statement Parser Spec
## Purpose
This document defines the backend-only cash-flow parsing rules for `fiscal-xbrl-core`.
This pass is limited to Rust parser behavior, taxonomy packs, and backend comparison tooling. It must not modify frontend files, frontend rendering logic, or frontend response shapes.
## Hydration Order
1. Load the selected surface pack.
2. For non-core packs, merge in any core balance-sheet and cash-flow surfaces that the selected pack does not override.
3. Resolve direct canonical cash-flow rows from statement rows.
4. Resolve aggregate-child cash-flow rows from matched detail components when direct canonical rows are absent.
5. Resolve formula-backed cash-flow rows from already-resolved canonical rows and helper rows.
6. Emit `unmapped` only for rows not consumed by canonical cash-flow parsing.
## Category Model
Cash-flow rows use these backend category keys:
- `operating`
- `investing`
- `financing`
- `free_cash_flow`
- `helper`
Rules:
- `helper` rows are backend-only and use `include_in_output: false`.
- Only `operating`, `investing`, `financing`, and `free_cash_flow` should appear in emitted `surface_rows`.
## Canonical Precedence Rule
Canonical cash-flow mappings take precedence over residual classification.
If a statement row is consumed by a canonical cash-flow row, it must not remain in `detail_rows["cash_flow"]["unmapped"]`.
## Alias Flattening Rule
Synonymous cash-flow concepts flatten into one canonical surface row.
Example:
- `NetCashProvidedByUsedInOperatingActivities`
- `NetCashProvidedByUsedInOperatingActivitiesContinuingOperations`
These must become one `operating_cash_flow` row with period-aware provenance.
## Per-Period Resolution Rule
Direct cash-flow matching is resolved per period, not by choosing one row globally.
For each canonical cash-flow row:
1. Collect all direct candidates.
2. For each period, choose the best candidate with a value in that period.
3. Build one canonical row from those period-specific winners.
4. Preserve the union of all consumed aliases in `source_concepts`, `source_row_keys`, and `source_fact_ids`.
## Sign Normalization Rule
Some canonical cash-flow rows require sign normalization.
Supported transform:
- `invert`
Rule:
- sign transforms are applied after direct or aggregate resolution
- sign transforms are applied before formula evaluation consumes the row
- emitted detail rows inherit the same transform when they belong to the transformed canonical row
- provenance is preserved unchanged
## Formula Rule
Structured formulas are evaluated only after their source surface rows have been resolved.
Supported operators:
- `sum`
- `subtract`
Current formulas:
- `changes_unearned_revenue = contract_liability_incurred - contract_liability_recognized`
- `changes_other_operating_activities = changes_other_current_assets + changes_other_current_liabilities + changes_other_noncurrent_assets + changes_other_noncurrent_liabilities`
- `free_cash_flow = operating_cash_flow + capital_expenditures`
## Helper Row Rule
Helper rows exist only to support formulas and canonical grouping.
Current helper rows:
- `contract_liability_incurred`
- `contract_liability_recognized`
- `changes_other_current_assets`
- `changes_other_current_liabilities`
- `changes_other_noncurrent_assets`
- `changes_other_noncurrent_liabilities`
Behavior:
- helper rows remain available for formula evaluation
- helper rows do not appear in emitted `surface_rows`
- helper rows do not create emitted detail buckets
- helper rows still consume matched backend sources so those rows do not leak into `unmapped`
## Residual Pruning Rule
`cash_flow.unmapped` is a strict remainder set.
A cash-flow statement row must be excluded from `unmapped` when either of these is true:
- its row key was consumed by a canonical cash-flow row
- its concept key was consumed by a canonical cash-flow row
## Sector Inheritance Rule
Sector packs inherit the core cash-flow taxonomy unless they provide an explicit cash-flow override.
Current behavior:
- bank/lender inherits core cash-flow rows
- broker/asset manager inherits core cash-flow rows
- insurance inherits core cash-flow rows
- REIT/real estate inherits core cash-flow rows
No first-pass sector-specific cash-flow overrides are required.
## Synonym vs Aggregate Child Rule
Two cases must remain distinct.
### Synonym aliases
Different concept names for the same canonical cash-flow meaning.
Behavior:
- flatten into one canonical surface row
- do not emit duplicate detail rows
- do not remain in `unmapped`
### Aggregate child components
Rows that legitimately roll into a subtotal or grouped adjustment row.
Behavior:
- may remain as detail rows beneath the canonical parent when grouping is enabled
- must not remain in `unmapped` after being consumed
## Required Invariants
- A consumed cash-flow source must never remain in `cash_flow.unmapped`.
- A synonym alias must never create more than one canonical cash-flow row.
- Hidden helper surfaces may consume sources but must not appear in emitted `surface_rows`.
- Formula-derived rows inherit canonical provenance from their source surfaces.
- The frontend response shape remains unchanged.
## Test Matrix
The parser must cover:
- direct sign inversion for `capital_expenditures`
- direct sign inversion for `debt_repaid`
- direct sign inversion for `share_repurchases`
- direct mapping for `operating_cash_flow`
- formula derivation for `changes_unearned_revenue`
- formula derivation for `changes_other_operating_activities`
- formula derivation for `free_cash_flow`
- helper rows staying out of emitted cash-flow surfaces
- residual pruning of canonically consumed cash-flow rows
- sector packs receiving merged core cash-flow coverage without changing frontend contracts
- fallback classification for fact-only cash-flow concepts such as `IncreaseDecreaseInAccountsReceivable` and `PaymentsOfDividends`
## Learnings Reusable For Other Statements
The same parser rules now apply consistently across income, balance, and cash flow:
- canonical mapping outranks residual classification
- direct aliases resolve per period
- helper rows may exist backend-only when formulas need them
- consumed sources must be removed from `unmapped`
- sector packs inherit common canonical coverage instead of duplicating it


@@ -0,0 +1,103 @@
# Operating Statement Parser Spec
## Purpose
This document defines the backend-only parsing rules for operating statement hydration in `fiscal-xbrl-core`.
This pass is intentionally limited to Rust parser behavior. It must not change frontend files, frontend rendering logic, or API response shapes.
## Hydration Order
1. Generic compact surface mapping builds initial `surface_rows`, `detail_rows`, and `unmapped` residuals.
2. Universal income parsing rewrites the income statement into canonical operating-statement rows.
3. Canonical income parsing is authoritative for income provenance and must prune any consumed residual rows from `detail_rows["income"]["unmapped"]`.
## Canonical Precedence Rule
For income rows, canonical universal mappings take precedence over generic residual classification.
If an income concept is consumed by a canonical operating-statement row, it must not remain in `unmapped`.
## Alias Flattening Rule
Multiple source aliases for the same canonical operating-statement concept must flatten into a single canonical surface row.
Examples:
- `us-gaap:OtherOperatingExpense`
- `us-gaap:OtherOperatingExpenses`
- `us-gaap:OtherCostAndExpenseOperating`
These may differ by filer or period, but they still represent one canonical row such as `other_operating_expense`.
## Per-Period Resolution Rule
Direct canonical matching is resolved per period, not by selecting one global winner for all periods.
For each canonical income row:
1. Collect all direct statement-row matches.
2. For each period, keep only candidates with a value in that period.
3. Choose the best candidate for that period using existing ranking rules.
4. Build one canonical row whose `values` and `resolved_source_row_keys` are assembled period-by-period.
The canonical row's provenance is the union of all consumed aliases, even if a different alias wins in different periods.
## Residual Pruning Rule
After canonical income rows are resolved:
- collect all consumed source row keys
- collect all consumed concept keys
- remove any residual income detail row from `unmapped` if either identifier matches
`unmapped` is a strict remainder set after income canonicalization.
## Synonym vs Aggregate Child Rule
Two cases must remain distinct:
### Synonym aliases
Different concept names representing the same canonical meaning.
Behavior:
- flatten into one canonical surface row
- do not emit as detail rows
- do not leave in `unmapped`
### Aggregate child components
Rows that are true components of a higher-level canonical row, such as:
- `SalesAndMarketingExpense`
- `GeneralAndAdministrativeExpense`
which roll up into `selling_general_and_administrative`
Behavior:
- may appear as detail rows under the canonical parent
- must not also remain in `unmapped` once consumed by that canonical parent
## Required Invariants
For income parsing, each consumed source must appear in exactly one of these places:
- canonical surface provenance
- canonical detail provenance
- `unmapped`
It must never appear in more than one place at the same time.
Additional invariants:
- canonical surface rows are unique by canonical key
- aliases are flattened into one canonical row
- `resolved_source_row_keys` are period-specific
- normalization counts reflect the post-pruning state
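The exclusivity invariant lends itself to a small membership check, e.g. in a debug assertion or test helper. The set shapes here are assumptions for the sketch; the real provenance structures are richer:

```rust
use std::collections::HashSet;

// A consumed source identifier must sit in exactly one of: canonical surface
// provenance, canonical detail provenance, or `unmapped`.
fn appears_in_exactly_one(
    key: &str,
    surface: &HashSet<String>,
    detail: &HashSet<String>,
    unmapped: &HashSet<String>,
) -> bool {
    [surface, detail, unmapped]
        .iter()
        .filter(|set| set.contains(key))
        .count()
        == 1
}

fn main() {
    let surface: HashSet<String> = ["us-gaap:OtherOperatingExpense".to_string()]
        .into_iter()
        .collect();
    let detail: HashSet<String> = HashSet::new();
    let unmapped: HashSet<String> = HashSet::new();
    assert!(appears_in_exactly_one(
        "us-gaap:OtherOperatingExpense",
        &surface,
        &detail,
        &unmapped
    ));
}
```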
## Performance Constraints
- Use `HashSet` membership for consumed-source pruning.
- Build candidate collections once per canonical definition.
- Avoid UI-side dedupe or post-processing.
- Keep the parser close to linear in candidate volume per definition.
## Test Matrix
The parser must cover:
- direct alias dedupe for `other_operating_expense`
- period-sparse alias merge into a single canonical row
- pruning of canonically consumed aliases from `income.unmapped`
- preservation of truly unrelated residual rows
- pruning of formula-consumed component rows from `income.unmapped`
## Learnings For Other Statements
The same backend rules should later be applied to balance sheet and cash flow:
- canonical mapping must outrank residual classification
- alias resolution should be per-period
- consumed sources must be removed from `unmapped`
- synonym aliases and aggregate child components must be treated differently
When balance sheet and cash flow are upgraded, they should adopt these invariants without changing frontend response shapes.


@@ -37,10 +37,12 @@ static IDENTIFIER_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?identifier\b[^>]*\bscheme=["']([^"']+)["'][^>]*>(.*?)</(?:[a-z0-9_\-]+:)?identifier>"#).unwrap()
});
static SEGMENT_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?segment\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?segment>"#).unwrap()
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?segment\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?segment>"#)
.unwrap()
});
static SCENARIO_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?scenario\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?scenario>"#).unwrap()
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?scenario\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?scenario>"#)
.unwrap()
});
static START_DATE_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?startDate>(.*?)</(?:[a-z0-9_\-]+:)?startDate>"#).unwrap()
@@ -55,7 +57,8 @@ static MEASURE_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?measure>(.*?)</(?:[a-z0-9_\-]+:)?measure>"#).unwrap()
});
static LABEL_LINK_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelLink\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?labelLink>"#).unwrap()
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelLink\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?labelLink>"#)
.unwrap()
});
static PRESENTATION_LINK_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationLink\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?presentationLink>"#).unwrap()
@@ -67,12 +70,14 @@ static LABEL_RESOURCE_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?label\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?label>"#).unwrap()
});
static LABEL_ARC_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?labelArc>)?"#).unwrap()
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?labelArc>)?"#)
.unwrap()
});
static PRESENTATION_ARC_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?presentationArc>)?"#).unwrap()
});
static ATTR_RE: Lazy<Regex> = Lazy::new(|| Regex::new(r#"([a-zA-Z0-9:_\-]+)=["']([^"']+)["']"#).unwrap());
static ATTR_RE: Lazy<Regex> =
Lazy::new(|| Regex::new(r#"([a-zA-Z0-9:_\-]+)=["']([^"']+)["']"#).unwrap());
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
@@ -451,7 +456,8 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
});
};
let instance_text = fetch_text(&client, &instance_asset.url).context("fetch request failed for XBRL instance")?;
let instance_text = fetch_text(&client, &instance_asset.url)
.context("fetch request failed for XBRL instance")?;
let parsed_instance = parse_xbrl_instance(&instance_text, Some(instance_asset.name.clone()));
let mut label_by_concept = HashMap::new();
@@ -459,11 +465,9 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
let mut source = "xbrl_instance".to_string();
let mut parse_error = None;
for asset in discovered
.assets
.iter()
.filter(|asset| asset.is_selected && (asset.asset_type == "presentation" || asset.asset_type == "label"))
{
for asset in discovered.assets.iter().filter(|asset| {
asset.is_selected && (asset.asset_type == "presentation" || asset.asset_type == "label")
}) {
match fetch_text(&client, &asset.url) {
Ok(content) => {
if asset.asset_type == "presentation" {
@@ -515,10 +519,15 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
pack_selection.pack,
&mut compact_model,
)?;
let kpi_result = kpi_mapper::build_taxonomy_kpis(&materialized.periods, &facts, pack_selection.pack)?;
let kpi_result =
kpi_mapper::build_taxonomy_kpis(&materialized.periods, &facts, pack_selection.pack)?;
compact_model.normalization_summary.kpi_row_count = kpi_result.rows.len();
for warning in kpi_result.warnings {
if !compact_model.normalization_summary.warnings.contains(&warning) {
if !compact_model
.normalization_summary
.warnings
.contains(&warning)
{
compact_model.normalization_summary.warnings.push(warning);
}
}
@@ -526,7 +535,11 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
&mut compact_model.concept_mappings,
kpi_result.mapping_assignments,
);
surface_mapper::apply_mapping_assignments(&mut concepts, &mut facts, &compact_model.concept_mappings);
surface_mapper::apply_mapping_assignments(
&mut concepts,
&mut facts,
&compact_model.concept_mappings,
);
let has_rows = materialized
.statement_rows
@@ -572,7 +585,11 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
concepts_count: concepts.len(),
dimensions_count: facts
.iter()
.flat_map(|fact| fact.dimensions.iter().map(|dimension| format!("{}::{}", dimension.axis, dimension.member)))
.flat_map(|fact| {
fact.dimensions
.iter()
.map(|dimension| format!("{}::{}", dimension.axis, dimension.member))
})
.collect::<HashSet<_>>()
.len(),
assets: discovered.assets,
@@ -622,7 +639,10 @@ struct DiscoveredAssets {
assets: Vec<AssetOutput>,
}
fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Result<DiscoveredAssets> {
fn discover_filing_assets(
input: &HydrateFilingRequest,
client: &Client,
) -> Result<DiscoveredAssets> {
let Some(directory_url) = resolve_filing_directory_url(
input.filing_url.as_deref(),
&input.cik,
@@ -631,12 +651,19 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
return Ok(DiscoveredAssets { assets: vec![] });
};
let payload = fetch_json::<FilingDirectoryPayload>(client, &format!("{directory_url}index.json")).ok();
let payload =
fetch_json::<FilingDirectoryPayload>(client, &format!("{directory_url}index.json")).ok();
let mut discovered = Vec::new();
if let Some(items) = payload.and_then(|payload| payload.directory.and_then(|directory| directory.item)) {
if let Some(items) =
payload.and_then(|payload| payload.directory.and_then(|directory| directory.item))
{
for item in items {
let Some(name) = item.name.map(|name| name.trim().to_string()).filter(|name| !name.is_empty()) else {
let Some(name) = item
.name
.map(|name| name.trim().to_string())
.filter(|name| !name.is_empty())
else {
continue;
};
@@ -683,12 +710,19 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
score_instance(&asset.name, input.primary_document.as_deref()),
)
})
.max_by(|left, right| left.1.partial_cmp(&right.1).unwrap_or(std::cmp::Ordering::Equal))
.max_by(|left, right| {
left.1
.partial_cmp(&right.1)
.unwrap_or(std::cmp::Ordering::Equal)
})
.map(|entry| entry.0);
for asset in &mut discovered {
asset.score = if asset.asset_type == "instance" {
Some(score_instance(&asset.name, input.primary_document.as_deref()))
Some(score_instance(
&asset.name,
input.primary_document.as_deref(),
))
} else if asset.asset_type == "pdf" {
Some(score_pdf(&asset.name, asset.size_bytes))
} else {
@@ -708,7 +742,11 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
Ok(DiscoveredAssets { assets: discovered })
}
fn resolve_filing_directory_url(filing_url: Option<&str>, cik: &str, accession_number: &str) -> Option<String> {
fn resolve_filing_directory_url(
filing_url: Option<&str>,
cik: &str,
accession_number: &str,
) -> Option<String> {
if let Some(filing_url) = filing_url.map(str::trim).filter(|value| !value.is_empty()) {
if let Some(last_slash) = filing_url.rfind('/') {
if last_slash > "https://".len() {
@@ -725,7 +763,10 @@ fn resolve_filing_directory_url(filing_url: Option<&str>, cik: &str, accession_n
}
fn normalize_cik_for_path(value: &str) -> Option<String> {
let digits = value.chars().filter(|char| char.is_ascii_digit()).collect::<String>();
let digits = value
.chars()
.filter(|char| char.is_ascii_digit())
.collect::<String>();
if digits.is_empty() {
return None;
}
@@ -741,16 +782,25 @@ fn classify_asset_type(name: &str) -> &'static str {
return "schema";
}
if lower.ends_with(".xml") {
if lower.ends_with("_pre.xml") || lower.ends_with("-pre.xml") || lower.contains("presentation") {
if lower.ends_with("_pre.xml")
|| lower.ends_with("-pre.xml")
|| lower.contains("presentation")
{
return "presentation";
}
if lower.ends_with("_lab.xml") || lower.ends_with("-lab.xml") || lower.contains("label") {
return "label";
}
if lower.ends_with("_cal.xml") || lower.ends_with("-cal.xml") || lower.contains("calculation") {
if lower.ends_with("_cal.xml")
|| lower.ends_with("-cal.xml")
|| lower.contains("calculation")
{
return "calculation";
}
if lower.ends_with("_def.xml") || lower.ends_with("-def.xml") || lower.contains("definition") {
if lower.ends_with("_def.xml")
|| lower.ends_with("-def.xml")
|| lower.contains("definition")
{
return "definition";
}
return "instance";
@@ -779,7 +829,11 @@ fn score_instance(name: &str, primary_document: Option<&str>) -> f64 {
score += 5.0;
}
}
if lower.contains("cal") || lower.contains("def") || lower.contains("lab") || lower.contains("pre") {
if lower.contains("cal")
|| lower.contains("def")
|| lower.contains("lab")
|| lower.contains("pre")
{
score -= 3.0;
}
score
@@ -819,7 +873,9 @@ fn fetch_text(client: &Client, url: &str) -> Result<String> {
if !response.status().is_success() {
return Err(anyhow!("request failed for {url} ({})", response.status()));
}
response.text().with_context(|| format!("unable to read response body for {url}"))
response
.text()
.with_context(|| format!("unable to read response body for {url}"))
}
fn fetch_json<T: for<'de> Deserialize<'de>>(client: &Client, url: &str) -> Result<T> {
@@ -847,17 +903,36 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
let mut facts = Vec::new();
for captures in FACT_RE.captures_iter(raw) {
let prefix = captures.get(1).map(|value| value.as_str().trim()).unwrap_or_default();
let local_name = captures.get(2).map(|value| value.as_str().trim()).unwrap_or_default();
let attrs = captures.get(3).map(|value| value.as_str()).unwrap_or_default();
let body = decode_xml_entities(captures.get(4).map(|value| value.as_str()).unwrap_or_default().trim());
let prefix = captures
.get(1)
.map(|value| value.as_str().trim())
.unwrap_or_default();
let local_name = captures
.get(2)
.map(|value| value.as_str().trim())
.unwrap_or_default();
let attrs = captures
.get(3)
.map(|value| value.as_str())
.unwrap_or_default();
let body = decode_xml_entities(
captures
.get(4)
.map(|value| value.as_str())
.unwrap_or_default()
.trim(),
);
if prefix.is_empty() || local_name.is_empty() || is_xbrl_infrastructure_prefix(prefix) {
continue;
}
let attr_map = parse_attrs(attrs);
let Some(context_id) = attr_map.get("contextRef").cloned().or_else(|| attr_map.get("contextref").cloned()) else {
let Some(context_id) = attr_map
.get("contextRef")
.cloned()
.or_else(|| attr_map.get("contextref").cloned())
else {
continue;
};
@@ -870,7 +945,10 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
.cloned()
.unwrap_or_else(|| format!("urn:unknown:{prefix}"));
let context = context_by_id.get(&context_id);
let unit_ref = attr_map.get("unitRef").cloned().or_else(|| attr_map.get("unitref").cloned());
let unit_ref = attr_map
.get("unitRef")
.cloned()
.or_else(|| attr_map.get("unitref").cloned());
let unit = unit_ref
.as_ref()
.and_then(|unit_ref| unit_by_id.get(unit_ref))
@@ -896,8 +974,12 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
period_start: context.and_then(|value| value.period_start.clone()),
period_end: context.and_then(|value| value.period_end.clone()),
period_instant: context.and_then(|value| value.period_instant.clone()),
dimensions: context.map(|value| value.dimensions.clone()).unwrap_or_default(),
is_dimensionless: context.map(|value| value.dimensions.is_empty()).unwrap_or(true),
dimensions: context
.map(|value| value.dimensions.clone())
.unwrap_or_default(),
is_dimensionless: context
.map(|value| value.dimensions.is_empty())
.unwrap_or(true),
source_file: source_file.clone(),
});
}
@@ -916,10 +998,7 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
})
.collect::<Vec<_>>();
ParsedInstance {
contexts,
facts,
}
ParsedInstance { contexts, facts }
}
fn parse_namespace_map(raw: &str, root_tag_hint: &str) -> HashMap<String, String> {
@@ -935,7 +1014,10 @@ fn parse_namespace_map(raw: &str, root_tag_hint: &str) -> HashMap<String, String
.captures_iter(&root_start)
{
if let (Some(prefix), Some(uri)) = (captures.get(1), captures.get(2)) {
map.insert(prefix.as_str().trim().to_string(), uri.as_str().trim().to_string());
map.insert(
prefix.as_str().trim().to_string(),
uri.as_str().trim().to_string(),
);
}
}
@@ -946,16 +1028,26 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {
let mut contexts = HashMap::new();
for captures in CONTEXT_RE.captures_iter(raw) {
let Some(context_id) = captures.get(1).map(|value| value.as_str().trim().to_string()) else {
let Some(context_id) = captures
.get(1)
.map(|value| value.as_str().trim().to_string())
else {
continue;
};
let block = captures.get(2).map(|value| value.as_str()).unwrap_or_default();
let block = captures
.get(2)
.map(|value| value.as_str())
.unwrap_or_default();
let (entity_identifier, entity_scheme) = IDENTIFIER_RE
.captures(block)
.map(|captures| {
(
captures.get(2).map(|value| decode_xml_entities(value.as_str().trim())),
captures.get(1).map(|value| decode_xml_entities(value.as_str().trim())),
captures
.get(2)
.map(|value| decode_xml_entities(value.as_str().trim())),
captures
.get(1)
.map(|value| decode_xml_entities(value.as_str().trim())),
)
})
.unwrap_or((None, None));
@@ -984,7 +1076,10 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {
let mut dimensions = Vec::new();
if let Some(segment_value) = segment.as_ref() {
if let Some(members) = segment_value.get("explicitMembers").and_then(|value| value.as_array()) {
if let Some(members) = segment_value
.get("explicitMembers")
.and_then(|value| value.as_array())
{
for member in members {
if let (Some(axis), Some(member_value)) = (
member.get("axis").and_then(|value| value.as_str()),
@@ -999,7 +1094,10 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {
}
}
if let Some(scenario_value) = scenario.as_ref() {
if let Some(members) = scenario_value.get("explicitMembers").and_then(|value| value.as_array()) {
if let Some(members) = scenario_value
.get("explicitMembers")
.and_then(|value| value.as_array())
{
for member in members {
if let (Some(axis), Some(member_value)) = (
member.get("axis").and_then(|value| value.as_str()),
@@ -1062,10 +1160,16 @@ fn parse_dimension_container(raw: &str) -> serde_json::Value {
fn parse_units(raw: &str) -> HashMap<String, ParsedUnit> {
let mut units = HashMap::new();
for captures in UNIT_RE.captures_iter(raw) {
let Some(id) = captures.get(1).map(|value| value.as_str().trim().to_string()) else {
let Some(id) = captures
.get(1)
.map(|value| value.as_str().trim().to_string())
else {
continue;
};
let block = captures.get(2).map(|value| value.as_str()).unwrap_or_default();
let block = captures
.get(2)
.map(|value| value.as_str())
.unwrap_or_default();
let measures = MEASURE_RE
.captures_iter(block)
.filter_map(|captures| captures.get(1))
@@ -1097,7 +1201,10 @@ fn parse_attrs(raw: &str) -> HashMap<String, String> {
let mut map = HashMap::new();
for captures in ATTR_RE.captures_iter(raw) {
if let (Some(name), Some(value)) = (captures.get(1), captures.get(2)) {
map.insert(name.as_str().to_string(), decode_xml_entities(value.as_str()));
map.insert(
name.as_str().to_string(),
decode_xml_entities(value.as_str()),
);
}
}
map
@@ -1138,12 +1245,20 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
let mut preferred = HashMap::<String, (String, i64)>::new();
for captures in LABEL_LINK_RE.captures_iter(raw) {
let block = captures.get(1).map(|value| value.as_str()).unwrap_or_default();
let block = captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default();
let mut loc_by_label = HashMap::<String, String>::new();
let mut resource_by_label = HashMap::<String, (String, Option<String>)>::new();
for captures in LOC_RE.captures_iter(block) {
let attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(label) = attrs.get("xlink:label").cloned() else {
continue;
};
@@ -1160,11 +1275,21 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
}
for captures in LABEL_RESOURCE_RE.captures_iter(block) {
let attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(label) = attrs.get("xlink:label").cloned() else {
continue;
};
let body = decode_xml_entities(
captures
.get(2)
.map(|value| value.as_str())
.unwrap_or_default(),
)
.split_whitespace()
.collect::<Vec<_>>()
.join(" ");
@@ -1175,7 +1300,12 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
}
for captures in LABEL_ARC_RE.captures_iter(block) {
let attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(from) = attrs.get("xlink:from").cloned() else {
continue;
};
@@ -1190,7 +1320,11 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
};
let priority = label_priority(role.as_deref());
let current = preferred.get(concept_key).cloned();
if current
.as_ref()
.map(|(_, current_priority)| priority > *current_priority)
.unwrap_or(true)
{
preferred.insert(concept_key.clone(), (label.clone(), priority));
}
}
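The arc loop above keeps at most one label per concept: a candidate replaces the stored label only when its role priority is strictly higher. A std-only sketch of that selection rule (the function name and tuple shape are illustrative, not the parser's API):

```rust
use std::collections::HashMap;

// Keep the highest-priority label seen for each concept key.
// Candidates are (concept_key, label, priority); the strict `>` means
// equal-priority candidates never displace the first label stored.
fn select_preferred_labels(
    candidates: &[(&str, &str, i64)],
) -> HashMap<String, (String, i64)> {
    let mut preferred = HashMap::<String, (String, i64)>::new();
    for (concept_key, label, priority) in candidates {
        let current = preferred.get(*concept_key).cloned();
        if current
            .as_ref()
            .map(|(_, current_priority)| *priority > *current_priority)
            .unwrap_or(true)
        {
            preferred.insert((*concept_key).to_string(), ((*label).to_string(), *priority));
        }
    }
    preferred
}
```

The strict comparison makes the outcome independent of how many equal-priority labels the linkbase repeats; only a genuinely higher-priority role changes the stored label.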
@@ -1207,18 +1341,31 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
let mut rows = Vec::new();
for captures in PRESENTATION_LINK_RE.captures_iter(raw) {
let link_attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(role_uri) = link_attrs.get("xlink:role").cloned() else {
continue;
};
let block = captures
.get(2)
.map(|value| value.as_str())
.unwrap_or_default();
let mut loc_by_label = HashMap::<String, (String, String, bool)>::new();
let mut children_by_label = HashMap::<String, Vec<(String, f64)>>::new();
let mut incoming = HashSet::<String>::new();
let mut all_referenced = HashSet::<String>::new();
for captures in LOC_RE.captures_iter(block) {
let attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(label) = attrs.get("xlink:label").cloned() else {
continue;
};
@@ -1228,14 +1375,27 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
let Some(qname) = qname_from_href(&href) else {
continue;
};
let Some((concept_key, qname, local_name)) = concept_from_qname(&qname, &namespaces)
else {
continue;
};
loc_by_label.insert(
label,
(
concept_key,
qname,
local_name.to_ascii_lowercase().contains("abstract"),
),
);
}
for captures in PRESENTATION_ARC_RE.captures_iter(block) {
let attrs = parse_attrs(
captures
.get(1)
.map(|value| value.as_str())
.unwrap_or_default(),
);
let Some(from) = attrs.get("xlink:from").cloned() else {
continue;
};
@@ -1248,8 +1408,16 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
let order = attrs
.get("order")
.and_then(|value| value.parse::<f64>().ok())
.unwrap_or_else(|| {
children_by_label
.get(&from)
.map(|children| children.len() as f64 + 1.0)
.unwrap_or(1.0)
});
children_by_label
.entry(from.clone())
.or_default()
.push((to.clone(), order));
incoming.insert(to.clone());
all_referenced.insert(from);
all_referenced.insert(to);
@@ -1281,7 +1449,11 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
return;
}
let parent_concept_key = parent_label.and_then(|parent| {
loc_by_label
.get(parent)
.map(|(concept_key, _, _)| concept_key.clone())
});
rows.push(PresentationNode {
concept_key: concept_key.clone(),
role_uri: role_uri.to_string(),
@@ -1292,7 +1464,11 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
});
let mut children = children_by_label.get(label).cloned().unwrap_or_default();
children.sort_by(|left, right| {
left.1
.partial_cmp(&right.1)
.unwrap_or(std::cmp::Ordering::Equal)
});
for (index, (child_label, _)) in children.into_iter().enumerate() {
dfs(
&child_label,
@@ -1400,7 +1576,10 @@ fn materialize_taxonomy_statements(
.clone()
.or_else(|| fact.period_instant.clone())
.unwrap_or_else(|| filing_date.to_string());
let id = format!(
"{date}-{compact_accession}-{}",
period_by_signature.len() + 1
);
let period_label = if fact.period_instant.is_some() && fact.period_start.is_none() {
"Instant".to_string()
} else if fact.period_start.is_some() && fact.period_end.is_some() {
@@ -1420,7 +1599,10 @@ fn materialize_taxonomy_statements(
accession_number: accession_number.to_string(),
filing_date: filing_date.to_string(),
period_start: fact.period_start.clone(),
period_end: fact
.period_end
.clone()
.or_else(|| fact.period_instant.clone()),
filing_type: filing_type.to_string(),
period_label,
},
@@ -1429,9 +1611,17 @@ fn materialize_taxonomy_statements(
let mut periods = period_by_signature.values().cloned().collect::<Vec<_>>();
periods.sort_by(|left, right| {
let left_key = left
.period_end
.clone()
.unwrap_or_else(|| left.filing_date.clone());
let right_key = right
.period_end
.clone()
.unwrap_or_else(|| right.filing_date.clone());
left_key
.cmp(&right_key)
.then_with(|| left.id.cmp(&right.id))
});
let period_id_by_signature = period_by_signature
.iter()
@@ -1440,7 +1630,10 @@ fn materialize_taxonomy_statements(
let mut presentation_by_concept = HashMap::<String, Vec<&PresentationNode>>::new();
for node in presentation {
presentation_by_concept
.entry(node.concept_key.clone())
.or_default()
.push(node);
}
let mut grouped_by_statement = empty_parsed_fact_map();
@@ -1502,9 +1695,13 @@ fn materialize_taxonomy_statements(
let mut concepts = Vec::<ConceptOutput>::new();
for statement_kind in statement_keys() {
let concept_groups = grouped_by_statement
.remove(statement_kind)
.unwrap_or_default();
let mut concept_keys = HashSet::<String>::new();
for node in presentation.iter().filter(|node| {
classify_statement_role(&node.role_uri).as_deref() == Some(statement_kind)
}) {
concept_keys.insert(node.concept_key.clone());
}
for concept_key in concept_groups.keys() {
@@ -1516,12 +1713,21 @@ fn materialize_taxonomy_statements(
.map(|concept_key| {
let nodes = presentation
.iter()
.filter(|node| {
node.concept_key == concept_key
&& classify_statement_role(&node.role_uri).as_deref()
== Some(statement_kind)
})
.collect::<Vec<_>>();
let order = nodes
.iter()
.map(|node| node.order)
.fold(f64::INFINITY, f64::min);
let depth = nodes.iter().map(|node| node.depth).min().unwrap_or(0);
let role_uri = nodes.first().map(|node| node.role_uri.clone());
let parent_concept_key = nodes
.first()
.and_then(|node| node.parent_concept_key.clone());
(concept_key, order, depth, role_uri, parent_concept_key)
})
.collect::<Vec<_>>();
@@ -1532,8 +1738,13 @@ fn materialize_taxonomy_statements(
.then_with(|| left.0.cmp(&right.0))
});
for (concept_key, presentation_order, depth, role_uri, parent_concept_key) in
ordered_concepts
{
let fact_group = concept_groups
.get(&concept_key)
.cloned()
.unwrap_or_default();
let (namespace_uri, local_name) = split_concept_key(&concept_key);
let qname = fact_group
.first()
@@ -1672,7 +1883,13 @@ fn empty_detail_row_map() -> DetailRowStatementMap {
}
fn statement_keys() -> [&'static str; 5] {
[
"income",
"balance",
"cash_flow",
"equity",
"comprehensive_income",
]
}
fn statement_key_ref(value: &str) -> Option<&'static str> {
@@ -1709,7 +1926,13 @@ fn pick_preferred_fact(grouped_facts: &[(i64, ParsedFact)]) -> Option<&(i64, Par
.unwrap_or_default();
left_date.cmp(&right_date)
})
.then_with(|| {
left.1
.value
.abs()
.partial_cmp(&right.1.value.abs())
.unwrap_or(std::cmp::Ordering::Equal)
})
})
}
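`pick_preferred_fact` above resolves ties with a composite comparator: later date first, then larger absolute value, with `partial_cmp` collapsing to `Equal` so a NaN value cannot panic the comparison. A self-contained sketch of the same pattern; `Fact` here is a pared-down stand-in for the parser's fact type:

```rust
// Illustrative fact: ISO-8601 dates sort chronologically as strings.
struct Fact {
    date: String,
    value: f64,
}

// Prefer the latest-dated fact; among same-date facts, the one with the
// largest absolute value. NaN comparisons fall back to Ordering::Equal.
fn pick_preferred(facts: &[Fact]) -> Option<&Fact> {
    facts.iter().max_by(|left, right| {
        left.date.cmp(&right.date).then_with(|| {
            left.value
                .abs()
                .partial_cmp(&right.value.abs())
                .unwrap_or(std::cmp::Ordering::Equal)
        })
    })
}
```

Using absolute value for the second key keeps the choice symmetric for rows reported with either sign convention.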
@@ -1779,12 +2002,6 @@ fn classify_statement_role(role_uri: &str) -> Option<String> {
fn concept_statement_fallback(local_name: &str) -> Option<String> {
let normalized = local_name.to_ascii_lowercase();
if Regex::new(r#"equity|retainedearnings|additionalpaidincapital"#)
.unwrap()
.is_match(&normalized)
@@ -1794,6 +2011,22 @@ fn concept_statement_fallback(local_name: &str) -> Option<String> {
if normalized.contains("comprehensiveincome") {
return Some("comprehensive_income".to_string());
}
if Regex::new(
r#"deferredpolicyacquisitioncosts(andvalueofbusinessacquired)?$|supplementaryinsuranceinformationdeferredpolicyacquisitioncosts$|deferredacquisitioncosts$"#,
)
.unwrap()
.is_match(&normalized)
{
return Some("balance".to_string());
}
if Regex::new(
r#"netcashprovidedbyusedin.*activities|increasedecreasein|paymentstoacquire|paymentsforcapitalimprovements$|paymentsfordepositsonrealestateacquisitions$|paymentsforrepurchase|paymentsofdividends|dividendscommonstockcash$|proceedsfrom|repaymentsofdebt|sharebasedcompensation$|allocatedsharebasedcompensationexpense$|depreciationdepletionandamortization$|depreciationamortizationandaccretionnet$|depreciationandamortization$|depreciationamortizationandother$|otheradjustmentstoreconcilenetincomelosstocashprovidedbyusedinoperatingactivities"#,
)
.unwrap()
.is_match(&normalized)
{
return Some("cash_flow".to_string());
}
if Regex::new(
r#"asset|liabilit|debt|financingreceivable|loansreceivable|deposits|allowanceforcreditloss|futurepolicybenefits|policyholderaccountbalances|unearnedpremiums|realestateinvestmentproperty|grossatcarryingvalue|investmentproperty"#,
)
@@ -1967,7 +2200,10 @@ mod tests {
vec![],
)
.expect("core pack should load and map");
let income_surface_rows = model
.surface_rows
.get("income")
.expect("income surface rows");
let op_expenses = income_surface_rows
.iter()
.find(|row| row.key == "operating_expenses")
@@ -1978,7 +2214,10 @@ mod tests {
.expect("revenue surface row");
assert_eq!(revenue.values.get("2025").copied().flatten(), Some(120.0));
assert_eq!(
op_expenses.values.get("2024").copied().flatten(),
Some(40.0)
);
assert_eq!(op_expenses.detail_count, Some(2));
let operating_expense_details = model
@@ -1987,8 +2226,12 @@ mod tests {
.and_then(|groups| groups.get("operating_expenses"))
.expect("operating expenses details");
assert_eq!(operating_expense_details.len(), 2);
assert!(operating_expense_details
.iter()
.any(|row| row.key == "sga-row"));
assert!(operating_expense_details
.iter()
.any(|row| row.key == "rd-row"));
let residual_rows = model
.detail_rows
@@ -2003,17 +2246,26 @@ mod tests {
.concept_mappings
.get("http://fasb.org/us-gaap/2024#ResearchAndDevelopmentExpense")
.expect("rd mapping");
assert_eq!(
rd_mapping.detail_parent_surface_key.as_deref(),
Some("operating_expenses")
);
assert_eq!(
rd_mapping.surface_key.as_deref(),
Some("operating_expenses")
);
let residual_mapping = model
.concept_mappings
.get("urn:company#OtherOperatingCharges")
.expect("residual mapping");
assert!(residual_mapping.residual_flag);
assert_eq!(
residual_mapping.detail_parent_surface_key.as_deref(),
Some("unmapped")
);
assert_eq!(model.normalization_summary.surface_row_count, 6);
assert_eq!(model.normalization_summary.detail_row_count, 3);
assert_eq!(model.normalization_summary.unmapped_row_count, 1);
}
@@ -2051,18 +2303,60 @@ mod tests {
#[test]
fn classifies_pack_specific_concepts_without_presentation_roles() {
assert_eq!(
concept_statement_fallback(
"FinancingReceivableExcludingAccruedInterestAfterAllowanceForCreditLoss"
)
.as_deref(),
Some("balance")
);
assert_eq!(
concept_statement_fallback("Deposits").as_deref(),
Some("balance")
);
assert_eq!(
concept_statement_fallback("RealEstateInvestmentPropertyNet").as_deref(),
Some("balance")
);
assert_eq!(
concept_statement_fallback("DeferredPolicyAcquisitionCosts").as_deref(),
Some("balance")
);
assert_eq!(
concept_statement_fallback("DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired")
.as_deref(),
Some("balance")
);
assert_eq!(
concept_statement_fallback("IncreaseDecreaseInAccountsReceivable").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("PaymentsOfDividends").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("RepaymentsOfDebt").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("ShareBasedCompensation").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("PaymentsForCapitalImprovements").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("PaymentsForDepositsOnRealEstateAcquisitions").as_deref(),
Some("cash_flow")
);
assert_eq!(
concept_statement_fallback("LeaseIncome").as_deref(),
Some("income")
);
assert_eq!(
concept_statement_fallback("DirectCostsOfLeasedAndRentedPropertyOrEquipment")
.as_deref(),
Some("income")
);
}

File diff suppressed because it is too large

@@ -1,12 +1,22 @@
use anyhow::{anyhow, Context, Result};
use serde::Deserialize;
use std::collections::HashMap;
use std::env;
use std::fs;
use std::path::PathBuf;
use crate::pack_selector::FiscalPack;
fn default_include_in_output() -> bool {
true
}
#[derive(Debug, Deserialize, Clone, Copy, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum SurfaceSignTransform {
Invert,
}
#[derive(Debug, Deserialize, Clone)]
pub struct SurfacePackFile {
pub version: String,
@@ -25,9 +35,44 @@ pub struct SurfaceDefinition {
pub rollup_policy: String,
pub allowed_source_concepts: Vec<String>,
pub allowed_authoritative_concepts: Vec<String>,
pub formula_fallback: Option<SurfaceFormulaFallback>,
pub detail_grouping_policy: String,
pub materiality_policy: String,
#[serde(default = "default_include_in_output")]
pub include_in_output: bool,
#[serde(default)]
pub sign_transform: Option<SurfaceSignTransform>,
}
#[derive(Debug, Deserialize, Clone)]
#[serde(untagged)]
pub enum SurfaceFormulaFallback {
LegacyString(#[allow(dead_code)] String),
Structured(SurfaceFormula),
}
impl SurfaceFormulaFallback {
pub fn structured(&self) -> Option<&SurfaceFormula> {
match self {
Self::Structured(formula) => Some(formula),
Self::LegacyString(_) => None,
}
}
}
#[derive(Debug, Deserialize, Clone)]
pub struct SurfaceFormula {
pub op: SurfaceFormulaOp,
pub sources: Vec<String>,
#[serde(default)]
pub treat_null_as_zero: bool,
}
#[derive(Debug, Deserialize, Clone, Copy, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum SurfaceFormulaOp {
Sum,
Subtract,
}
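A structured `formula_fallback` carries an operation, an ordered source list, and a `treat_null_as_zero` flag. The sketch below shows one way such a formula could be evaluated over already-resolved canonical rows (hydration step 5 in the spec). The first-minus-rest reading of `subtract` and the `eval_formula` name are assumptions for illustration, not confirmed parser behavior:

```rust
// Illustrative op enum mirroring SurfaceFormulaOp above.
#[derive(Clone, Copy)]
enum Op {
    Sum,
    Subtract,
}

// Evaluate a formula over resolved source values for one period.
// With treat_null_as_zero false, any missing source aborts the fallback.
fn eval_formula(op: Op, sources: &[Option<f64>], treat_null_as_zero: bool) -> Option<f64> {
    let mut values = Vec::with_capacity(sources.len());
    for source in sources {
        match source {
            Some(value) => values.push(*value),
            None if treat_null_as_zero => values.push(0.0),
            None => return None, // a required source is missing
        }
    }
    let first = *values.first()?;
    let rest: f64 = values.iter().skip(1).sum();
    Some(match op {
        Op::Sum => first + rest,
        // Assumed semantics: first source minus the remaining sources.
        Op::Subtract => first - rest,
    })
}
```

Vetoing the result on a missing source is the conservative choice: the row emits nothing rather than a partial aggregate.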
#[derive(Debug, Deserialize, Clone)]
@@ -147,7 +192,9 @@ pub fn resolve_taxonomy_dir() -> Result<PathBuf> {
candidates
.into_iter()
.find(|path| path.is_dir())
.ok_or_else(|| {
anyhow!("taxonomy resolution failed: unable to locate runtime taxonomy directory")
})
}
pub fn load_surface_pack(pack: FiscalPack) -> Result<SurfacePackFile> {
@@ -156,14 +203,52 @@ pub fn load_surface_pack(pack: FiscalPack) -> Result<SurfacePackFile> {
.join("fiscal")
.join("v1")
.join(format!("{}.surface.json", pack.as_str()));
let mut file = load_surface_pack_file(&path)?;
if !matches!(pack, FiscalPack::Core) {
let core_path = taxonomy_dir
.join("fiscal")
.join("v1")
.join("core.surface.json");
let core_file = load_surface_pack_file(&core_path)?;
let pack_inherited_keys = file
.surfaces
.iter()
.filter(|surface| surface.statement == "balance" || surface.statement == "cash_flow")
.map(|surface| (surface.statement.clone(), surface.surface_key.clone()))
.collect::<std::collections::HashSet<_>>();
file.surfaces.extend(
core_file
.surfaces
.into_iter()
.filter(|surface| surface.statement == "balance" || surface.statement == "cash_flow")
.filter(|surface| {
!pack_inherited_keys
.contains(&(surface.statement.clone(), surface.surface_key.clone()))
}),
);
}
let _ = (&file.version, &file.pack);
Ok(file)
}
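`load_surface_pack` above implements hydration step 2: a non-core pack inherits any core balance and cash-flow surfaces it does not itself define, keyed by `(statement, surface_key)`. The same merge reduced to a std-only sketch; `Surface` and `merge_core_surfaces` are illustrative stand-ins for the real types:

```rust
use std::collections::HashSet;

// Pared-down surface row: just the fields the merge keys on.
#[derive(Clone, PartialEq, Debug)]
struct Surface {
    statement: String,
    surface_key: String,
}

// Append core balance/cash-flow surfaces the pack does not override.
fn merge_core_surfaces(pack: &mut Vec<Surface>, core: Vec<Surface>) {
    let inherited_statements = ["balance", "cash_flow"];
    let overridden = pack
        .iter()
        .filter(|surface| inherited_statements.contains(&surface.statement.as_str()))
        .map(|surface| (surface.statement.clone(), surface.surface_key.clone()))
        .collect::<HashSet<_>>();
    pack.extend(
        core.into_iter()
            .filter(|surface| inherited_statements.contains(&surface.statement.as_str()))
            .filter(|surface| {
                !overridden.contains(&(surface.statement.clone(), surface.surface_key.clone()))
            }),
    );
}
```

Keying on the `(statement, surface_key)` pair rather than the key alone means a pack overriding a balance row does not accidentally suppress a same-named cash-flow row from core.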
fn load_surface_pack_file(path: &PathBuf) -> Result<SurfacePackFile> {
let raw = fs::read_to_string(path).with_context(|| {
format!(
"taxonomy resolution failed: unable to read {}",
path.display()
)
})?;
serde_json::from_str::<SurfacePackFile>(&raw).with_context(|| {
format!(
"taxonomy resolution failed: unable to parse {}",
path.display()
)
})
}
pub fn load_crosswalk(regime: &str) -> Result<Option<CrosswalkFile>> {
let file_name = match regime {
"us-gaap" => "us-gaap.json",
@@ -173,10 +258,18 @@ pub fn load_crosswalk(regime: &str) -> Result<Option<CrosswalkFile>> {
let taxonomy_dir = resolve_taxonomy_dir()?;
let path = taxonomy_dir.join("crosswalk").join(file_name);
let raw = fs::read_to_string(&path).with_context(|| {
format!(
"taxonomy resolution failed: unable to read {}",
path.display()
)
})?;
let file = serde_json::from_str::<CrosswalkFile>(&raw).with_context(|| {
format!(
"taxonomy resolution failed: unable to parse {}",
path.display()
)
})?;
let _ = (&file.version, &file.regime);
Ok(Some(file))
}
@@ -188,10 +281,18 @@ pub fn load_kpi_pack(pack: FiscalPack) -> Result<KpiPackFile> {
.join("v1")
.join("kpis")
.join(format!("{}.kpis.json", pack.as_str()));
let raw = fs::read_to_string(&path).with_context(|| {
format!(
"taxonomy resolution failed: unable to read {}",
path.display()
)
})?;
let file = serde_json::from_str::<KpiPackFile>(&raw).with_context(|| {
format!(
"taxonomy resolution failed: unable to parse {}",
path.display()
)
})?;
let _ = (&file.version, &file.pack);
Ok(file)
}
@@ -202,10 +303,18 @@ pub fn load_universal_income_definitions() -> Result<UniversalIncomeFile> {
.join("fiscal")
.join("v1")
.join("universal_income.surface.json");
let raw = fs::read_to_string(&path).with_context(|| {
format!(
"taxonomy resolution failed: unable to read {}",
path.display()
)
})?;
let file = serde_json::from_str::<UniversalIncomeFile>(&raw).with_context(|| {
format!(
"taxonomy resolution failed: unable to parse {}",
path.display()
)
})?;
let _ = &file.version;
Ok(file)
}
@@ -216,10 +325,18 @@ pub fn load_income_bridge(pack: FiscalPack) -> Result<IncomeBridgeFile> {
.join("fiscal")
.join("v1")
.join(format!("{}.income-bridge.json", pack.as_str()));
let raw = fs::read_to_string(&path).with_context(|| {
format!(
"taxonomy resolution failed: unable to read {}",
path.display()
)
})?;
let file = serde_json::from_str::<IncomeBridgeFile>(&raw).with_context(|| {
format!(
"taxonomy resolution failed: unable to parse {}",
path.display()
)
})?;
let _ = (&file.version, &file.pack);
Ok(file)
}
@@ -230,17 +347,20 @@ mod tests {
#[test]
fn resolves_taxonomy_dir_and_loads_core_pack() {
let taxonomy_dir =
resolve_taxonomy_dir().expect("taxonomy dir should resolve during tests");
assert!(taxonomy_dir.exists());
let surface_pack =
load_surface_pack(FiscalPack::Core).expect("core surface pack should load");
assert_eq!(surface_pack.pack, "core");
assert!(!surface_pack.surfaces.is_empty());
let kpi_pack = load_kpi_pack(FiscalPack::Core).expect("core kpi pack should load");
assert_eq!(kpi_pack.pack, "core");
let universal_income =
load_universal_income_definitions().expect("universal income config should load");
assert!(!universal_income.rows.is_empty());
let core_bridge = load_income_bridge(FiscalPack::Core).expect("core bridge should load");

File diff suppressed because it is too large

@@ -156,7 +156,7 @@
"surface_key": "loans",
"statement": "balance",
"label": "Loans",
"category": "noncurrent_assets",
"order": 30,
"unit": "currency",
"rollup_policy": "aggregate_children",
@@ -181,7 +181,7 @@
"surface_key": "allowance_for_credit_losses",
"statement": "balance",
"label": "Allowance for Credit Losses",
"category": "noncurrent_assets",
"order": 40,
"unit": "currency",
"rollup_policy": "aggregate_children",
@@ -201,7 +201,7 @@
"surface_key": "deposits",
"statement": "balance",
"label": "Deposits",
"category": "current_liabilities",
"order": 80,
"unit": "currency",
"rollup_policy": "aggregate_children",
@@ -215,7 +215,7 @@
"surface_key": "total_assets",
"statement": "balance",
"label": "Total Assets",
"category": "derived",
"order": 90,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -229,7 +229,7 @@
"surface_key": "total_liabilities",
"statement": "balance",
"label": "Total Liabilities",
"category": "derived",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -243,7 +243,7 @@
"surface_key": "total_equity",
"statement": "balance",
"label": "Total Equity",
"category": "equity",
"order": 110,
"unit": "currency",
"rollup_policy": "direct_only",


@@ -63,7 +63,7 @@
"surface_key": "total_assets",
"statement": "balance",
"label": "Total Assets",
"category": "derived",
"order": 90,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -77,7 +77,7 @@
"surface_key": "total_liabilities",
"statement": "balance",
"label": "Total Liabilities",
"category": "derived",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -91,7 +91,7 @@
"surface_key": "total_equity",
"statement": "balance",
"label": "Total Equity",
"category": "equity",
"order": 110,
"unit": "currency",
"rollup_policy": "direct_only",

File diff suppressed because it is too large

@@ -119,7 +119,7 @@
"surface_key": "policy_liabilities",
"statement": "balance",
"label": "Policy Liabilities",
"category": "noncurrent_liabilities",
"order": 80,
"unit": "currency",
"rollup_policy": "aggregate_children",
@@ -145,17 +145,19 @@
"surface_key": "deferred_acquisition_costs",
"statement": "balance",
"label": "Deferred Acquisition Costs",
"category": "noncurrent_assets",
"order": 90,
"unit": "currency",
"rollup_policy": "aggregate_children",
"allowed_source_concepts": [
"us-gaap:DeferredPolicyAcquisitionCosts",
"us-gaap:DeferredAcquisitionCosts",
"us-gaap:DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired"
],
"allowed_authoritative_concepts": [
"us-gaap:DeferredPolicyAcquisitionCosts",
"us-gaap:DeferredAcquisitionCosts",
"us-gaap:DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired"
],
"formula_fallback": null,
"detail_grouping_policy": "group_all_children",
@@ -165,7 +167,7 @@
"surface_key": "total_assets",
"statement": "balance",
"label": "Total Assets",
"category": "derived",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -179,7 +181,7 @@
"surface_key": "total_liabilities",
"statement": "balance",
"label": "Total Liabilities",
"category": "derived",
"order": 110,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -193,7 +195,7 @@
"surface_key": "total_equity",
"statement": "balance",
"label": "Total Equity",
"category": "equity",
"order": 120,
"unit": "currency",
"rollup_policy": "direct_only",


@@ -78,7 +78,7 @@
"surface_key": "investment_property",
"statement": "balance",
"label": "Investment Property",
"category": "noncurrent_assets",
"order": 40,
"unit": "currency",
"rollup_policy": "aggregate_children",
@@ -99,7 +99,7 @@
"surface_key": "total_assets",
"statement": "balance",
"label": "Total Assets",
"category": "derived",
"order": 90,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -113,7 +113,7 @@
"surface_key": "total_liabilities",
"statement": "balance",
"label": "Total Liabilities",
"category": "derived",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -127,7 +127,7 @@
"surface_key": "total_equity",
"statement": "balance",
"label": "Total Equity",
"category": "equity",
"order": 110,
"unit": "currency",
"rollup_policy": "direct_only",
@@ -136,6 +136,25 @@
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "balance_default"
},
{
"surface_key": "capital_expenditures",
"statement": "cash_flow",
"label": "Capital Expenditures",
"category": "investing",
"order": 130,
"unit": "currency",
"rollup_policy": "aggregate_children",
"allowed_source_concepts": [
"us-gaap:PaymentsToAcquireCommercialRealEstate",
"us-gaap:PaymentsForCapitalImprovements",
"us-gaap:PaymentsForDepositsOnRealEstateAcquisitions"
],
"allowed_authoritative_concepts": [],
"formula_fallback": null,
"detail_grouping_policy": "group_all_children",
"materiality_policy": "cash_flow_default",
"sign_transform": "invert"
}
]
}
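The `"sign_transform": "invert"` flag on `capital_expenditures` exists because concepts such as `us-gaap:PaymentsForCapitalImprovements` are filed as positive outflow amounts, while the surface row should carry the conventional negative cash-flow sign. A minimal sketch of applying the transform at resolution time (the names are illustrative, not the parser's API):

```rust
// Mirrors the pack-level SurfaceSignTransform enum above.
#[derive(Clone, Copy)]
enum SignTransform {
    Invert,
}

// Flip the sign of a resolved per-period value when the surface asks for
// it; absent values pass through untouched.
fn apply_sign_transform(value: Option<f64>, transform: Option<SignTransform>) -> Option<f64> {
    match transform {
        Some(SignTransform::Invert) => value.map(|v| -v),
        None => value,
    }
}
```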


@@ -5,7 +5,7 @@ import { hydrateFilingTaxonomySnapshot } from '@/lib/server/taxonomy/engine';
import type { TaxonomyHydrationInput, TaxonomyHydrationResult } from '@/lib/server/taxonomy/types';
type ComparisonTarget = {
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
surfaceKey: string;
fiscalAiLabels: string[];
allowNotMeaningful?: boolean;
@@ -46,7 +46,7 @@ type FiscalAiTable = {
};
type ComparisonRow = {
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
surfaceKey: string;
fiscalAiLabel: string | null;
fiscalAiValueM: number | null;
@@ -89,6 +89,11 @@ const CASES: CompanyCase[] = [
surfaceKey: 'net_income',
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
},
{ statement: 'balance', surfaceKey: 'current_assets', fiscalAiLabels: ['Current Assets', 'Total Current Assets'] },
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
{ statement: 'cash_flow', surfaceKey: 'capital_expenditures', fiscalAiLabels: ['Capital Expenditures', 'Capital Expenditure'] },
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] },
]
},
{
@@ -113,6 +118,11 @@ const CASES: CompanyCase[] = [
surfaceKey: 'net_income',
fiscalAiLabels: ['Net Income to Common', 'Net Income Attributable to Common Shareholders', 'Net Income']
},
{ statement: 'balance', surfaceKey: 'loans', fiscalAiLabels: ['Net Loans', 'Loans', 'Loans Receivable'] },
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Net Cash from Operating Activities', 'Net Cash Provided by Operating'] },
{ statement: 'cash_flow', surfaceKey: 'investing_cash_flow', fiscalAiLabels: ['Cash from Investing Activities', 'Net Cash from Investing Activities', 'Net Cash Provided by Investing'] },
{ statement: 'cash_flow', surfaceKey: 'financing_cash_flow', fiscalAiLabels: ['Cash from Financing Activities', 'Net Cash from Financing Activities', 'Net Cash Provided by Financing'] },
]
},
{
@@ -137,6 +147,18 @@ const CASES: CompanyCase[] = [
surfaceKey: 'net_income',
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
},
{
statement: 'balance',
surfaceKey: 'deferred_acquisition_costs',
fiscalAiLabels: [
'Deferred Acquisition Costs',
'Deferred Policy Acquisition Costs',
'Deferred Policy Acquisition Costs and Value of Business Acquired'
]
},
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] },
]
},
{
@@ -154,7 +176,22 @@ const CASES: CompanyCase[] = [
statement: 'income',
surfaceKey: 'net_income',
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
},
{
statement: 'balance',
surfaceKey: 'investment_property',
fiscalAiLabels: [
'Investment Property',
'Investment Properties',
'Real Estate Investment Property, Net',
'Real Estate Investment Property, at Cost',
'Total real estate held for investment, at cost'
]
},
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
{ statement: 'cash_flow', surfaceKey: 'capital_expenditures', fiscalAiLabels: ['Capital Expenditures', 'Capital Expenditure'] },
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] }
]
},
{
@@ -184,6 +221,9 @@ const CASES: CompanyCase[] = [
];
function parseTickerFilter(argv: string[]) {
let ticker: string | null = null;
let statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'> | null = null;
for (const arg of argv) {
if (arg === '--help' || arg === '-h') {
console.log('Compare live Fiscal.ai standardized statement rows against local sidecar output.');
@@ -191,16 +231,26 @@ function parseTickerFilter(argv: string[]) {
console.log('Usage:');
console.log(' bun run scripts/compare-fiscal-ai-statements.ts');
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --ticker=MSFT');
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --statement=balance');
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --statement=cash_flow');
process.exit(0);
}
if (arg.startsWith('--ticker=')) {
const value = arg.slice('--ticker='.length).trim().toUpperCase();
ticker = value.length > 0 ? value : null;
continue;
}
if (arg.startsWith('--statement=')) {
const value = arg.slice('--statement='.length).trim().toLowerCase().replace(/-/g, '_');
if (value === 'income' || value === 'balance' || value === 'cash_flow') {
statement = value;
}
}
}
return { ticker, statement };
}
function normalizeLabel(value: string) {
@@ -295,10 +345,98 @@ function chooseInstantPeriodId(result: TaxonomyHydrationResult) {
return instantPeriods[0]?.id ?? null;
}
function parseColumnLabelPeriodEnd(columnLabel: string) {
const match = columnLabel.match(/^([A-Za-z]{3})\s+'?(\d{2,4})$/);
if (!match) {
return null;
}
const [, monthToken, yearToken] = match;
const monthMap: Record<string, number> = {
jan: 0,
feb: 1,
mar: 2,
apr: 3,
may: 4,
jun: 5,
jul: 6,
aug: 7,
sep: 8,
oct: 9,
nov: 10,
dec: 11
};
const month = monthMap[monthToken.toLowerCase()];
if (month === undefined) {
return null;
}
const parsedYear = Number.parseInt(yearToken, 10);
if (!Number.isFinite(parsedYear)) {
return null;
}
const year = yearToken.length === 2 ? 2000 + parsedYear : parsedYear;
return { month, year };
}
function choosePeriodIdForColumnLabel(
result: TaxonomyHydrationResult,
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>,
columnLabel: string
) {
const parsed = parseColumnLabelPeriodEnd(columnLabel);
if (!parsed) {
return null;
}
const matchingPeriods = result.periods
.filter((period): period is ResultPeriod => {
const end = periodEnd(period as ResultPeriod);
if (!end) {
return false;
}
const endDate = new Date(end);
if (Number.isNaN(endDate.getTime())) {
return false;
}
const periodMatchesStatement = statement === 'balance'
? !periodStart(period as ResultPeriod)
: Boolean(periodStart(period as ResultPeriod));
if (!periodMatchesStatement) {
return false;
}
return endDate.getUTCFullYear() === parsed.year && endDate.getUTCMonth() === parsed.month;
})
.sort((left, right) => {
if (statement !== 'balance') {
const leftStart = periodStart(left);
const rightStart = periodStart(right);
const leftDuration = leftStart
? Math.round((Date.parse(periodEnd(left) as string) - Date.parse(leftStart)) / (1000 * 60 * 60 * 24))
: -1;
const rightDuration = rightStart
? Math.round((Date.parse(periodEnd(right) as string) - Date.parse(rightStart)) / (1000 * 60 * 60 * 24))
: -1;
if (leftDuration !== rightDuration) {
return rightDuration - leftDuration;
}
}
return Date.parse(periodEnd(right) as string) - Date.parse(periodEnd(left) as string);
});
return matchingPeriods[0]?.id ?? null;
}
function findSurfaceValue(
result: TaxonomyHydrationResult,
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>,
surfaceKey: string,
referenceColumnLabel?: string
) {
const rows = result.surface_rows[statement] ?? [];
const row = rows.find((entry) => entry.key === surfaceKey) ?? null;
@@ -306,9 +444,11 @@ function findSurfaceValue(
return { row: null, value: null };
}
const periodId = (referenceColumnLabel
? choosePeriodIdForColumnLabel(result, statement, referenceColumnLabel)
: null) ?? (statement === 'balance'
? chooseInstantPeriodId(result)
: chooseDurationPeriodId(result));
if (periodId) {
const directValue = row.values[periodId];
@@ -412,14 +552,24 @@ async function fetchLatestAnnualFiling(company: CompanyCase): Promise<TaxonomyHy
async function scrapeFiscalAiTable(
page: import('@playwright/test').Page,
exchangeTicker: string,
statement: 'income' | 'balance' | 'cash_flow'
): Promise<FiscalAiTable> {
const pagePath = statement === 'income'
? 'income-statement'
: statement === 'balance'
? 'balance-sheet'
: 'cash-flow-statement';
const url = `https://fiscal.ai/company/${exchangeTicker}/financials/${pagePath}/annual/?templateType=standardized`;
await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 120_000 });
await page.waitForSelector('table', { timeout: 120_000 });
await page.waitForTimeout(2_500);
await page.evaluate(async () => {
window.scrollTo(0, document.body.scrollHeight);
await new Promise((resolve) => setTimeout(resolve, 750));
window.scrollTo(0, 0);
await new Promise((resolve) => setTimeout(resolve, 250));
});
return await page.evaluate(() => {
function normalizeLabel(value: string) {
@@ -452,45 +602,52 @@ async function scrapeFiscalAiTable(
return Number.isFinite(parsed) ? (negative ? -Math.abs(parsed) : parsed) : null;
}
const tables = Array.from(document.querySelectorAll('table'));
if (tables.length === 0) {
throw new Error('Fiscal.ai table not found');
}
const rowsByLabel = new Map<string, FiscalAiTableRow>();
let columnLabel = 'unknown';
for (const table of tables) {
const headerCells = Array.from(table.querySelectorAll('tr:first-child th, tr:first-child td'))
.map((cell) => cell.textContent?.trim() ?? '')
.filter((value) => value.length > 0);
const annualColumnIndex = headerCells.findIndex((value, index) => index > 0 && value !== 'LTM');
if (annualColumnIndex < 0) {
continue;
}
if (columnLabel === 'unknown') {
columnLabel = headerCells[annualColumnIndex] ?? 'unknown';
}
for (const row of Array.from(table.querySelectorAll('tr')).slice(1)) {
const cells = Array.from(row.querySelectorAll('td'));
if (cells.length <= annualColumnIndex) {
continue;
}
const label = cells[0]?.textContent?.trim() ?? '';
const valueText = cells[annualColumnIndex]?.textContent?.trim() ?? '';
if (!label) {
continue;
}
rowsByLabel.set(label, {
label,
normalizedLabel: normalizeLabel(label),
valueText,
value: parseDisplayedNumber(valueText)
});
}
}
const rows = Array.from(rowsByLabel.values());
return {
columnLabel,
rows
};
});
@@ -536,7 +693,7 @@ function compareRow(
): ComparisonRow {
const fiscalAiRow = findFiscalAiRow(fiscalAiTable.rows, target.fiscalAiLabels);
const fiscalAiValueM = fiscalAiRow?.value ?? null;
const ourSurface = findSurfaceValue(result, target.statement, target.surfaceKey, fiscalAiTable.columnLabel);
const ourValueM = roundMillions(ourSurface.value);
const absDiffM = absoluteDiff(ourValueM, fiscalAiValueM);
const relDiffValue = relativeDiff(ourValueM, fiscalAiValueM);
@@ -587,17 +744,34 @@ async function compareCase(page: import('@playwright/test').Page, company: Compa
throw new Error(`${company.ticker} parse_status=${result.parse_status}${result.parse_error ? ` parse_error=${result.parse_error}` : ''}`);
}
const statementKinds = new Set(company.comparisons.map((target) => target.statement));
const incomeTable = statementKinds.has('income')
? await scrapeFiscalAiTable(page, company.exchangeTicker, 'income')
: null;
const balanceTable = statementKinds.has('balance')
? await scrapeFiscalAiTable(page, company.exchangeTicker, 'balance')
: null;
const cashFlowTable = statementKinds.has('cash_flow')
? await scrapeFiscalAiTable(page, company.exchangeTicker, 'cash_flow')
: null;
const rows = company.comparisons.map((target) => {
const table = target.statement === 'income'
? incomeTable
: target.statement === 'balance'
? balanceTable
: cashFlowTable;
if (!table) {
throw new Error(`Missing scraped table for ${target.statement}`);
}
return compareRow(target, result, table);
});
const failures = rows.filter(
(row) => row.status === 'fail' || row.status === 'missing_ours' || row.status === 'missing_reference'
);
console.log(
`[compare-fiscal-ai] ${company.ticker} filing=${filing.accessionNumber} fiscal_pack=${result.fiscal_pack ?? 'null'} income_column="${incomeTable?.columnLabel ?? 'n/a'}" balance_column="${balanceTable?.columnLabel ?? 'n/a'}" cash_flow_column="${cashFlowTable?.columnLabel ?? 'n/a'}" pass=${rows.length - failures.length}/${rows.length}`
);
for (const row of rows) {
console.log(
@@ -625,18 +799,28 @@ async function compareCase(page: import('@playwright/test').Page, company: Compa
async function main() {
process.env.XBRL_ENGINE_TIMEOUT_MS = process.env.XBRL_ENGINE_TIMEOUT_MS ?? '180000';
const filters = parseTickerFilter(process.argv.slice(2));
const selectedCases = (filters.ticker
? CASES.filter((entry) => entry.ticker === filters.ticker)
: CASES
)
.map((entry) => ({
...entry,
comparisons: filters.statement
? entry.comparisons.filter((target) => target.statement === filters.statement)
: entry.comparisons
}))
.filter((entry) => entry.comparisons.length > 0);
if (selectedCases.length === 0) {
console.error(
`[compare-fiscal-ai] no matching cases for ticker=${filters.ticker ?? 'all'} statement=${filters.statement ?? 'all'}`
);
process.exitCode = 1;
return;
}
const browser = await chromium.launch({ headless: true });
const page = await browser.newPage({
userAgent: BROWSER_USER_AGENT
});