Expand backend financial statement parsers
rust/fiscal-xbrl-core/BALANCE_SHEET_PARSER_SPEC.md (new file, 144 lines)
@@ -0,0 +1,144 @@
# Balance Sheet Parser Spec

## Purpose

This document defines the backend-only balance-sheet parsing rules for `fiscal-xbrl-core`.

This pass is limited to Rust parser behavior and taxonomy packs. It must not modify frontend files, frontend rendering logic, or frontend response shapes.

## Hydration Order

1. Load the selected surface pack.
2. For non-core packs, merge in any core balance-sheet surfaces that the selected pack does not override.
3. Resolve direct canonical balance rows from statement rows.
4. Resolve aggregate-child rows from detail components when direct canonical rows are absent.
5. Resolve formula-backed balance rows from already-resolved canonical rows.
6. Emit `unmapped` only for rows not consumed by canonical balance parsing.
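A minimal sketch of step 2 (core-surface merging) in the hydration order above; the type and field names here are hypothetical, not the actual `fiscal-xbrl-core` API.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct SurfacePack {
    // canonical key -> accepted source concepts (illustrative shape)
    surfaces: HashMap<String, Vec<String>>,
}

/// Merge core balance-sheet surfaces into a sector pack without
/// overriding any key the sector pack already defines.
fn merge_core(mut pack: SurfacePack, core: &SurfacePack) -> SurfacePack {
    for (key, aliases) in &core.surfaces {
        pack.surfaces
            .entry(key.clone())
            .or_insert_with(|| aliases.clone());
    }
    pack
}

fn main() {
    let mut core = SurfacePack::default();
    core.surfaces
        .insert("cash".into(), vec!["CashAndCashEquivalents".into()]);
    core.surfaces
        .insert("loans".into(), vec!["LoansReceivable".into()]);

    let mut bank = SurfacePack::default();
    // The sector pack overrides `loans`; `cash` is inherited from core.
    bank.surfaces
        .insert("loans".into(), vec!["LoansAndLeasesReceivable".into()]);

    let merged = merge_core(bank, &core);
    assert_eq!(merged.surfaces["loans"], vec!["LoansAndLeasesReceivable".to_string()]);
    assert_eq!(merged.surfaces["cash"], vec!["CashAndCashEquivalents".to_string()]);
}
```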
## Category Taxonomy

Balance rows use these backend category keys:

- `current_assets`
- `noncurrent_assets`
- `current_liabilities`
- `noncurrent_liabilities`
- `equity`
- `derived`
- `sector_specific`

Default rule:

- use economic placement first
- reserve `sector_specific` for rows that cannot be expressed economically

## Canonical Precedence Rule

Canonical balance mappings take precedence over residual classification.

If a statement row is consumed by a canonical balance row, it must not remain in `detail_rows["balance"]["unmapped"]`.

## Alias Flattening Rule

Synonymous balance concepts flatten into one canonical surface row.

Example:

- `AccountsReceivableNetCurrent`
- `ReceivablesNetCurrent`

These must become one `accounts_receivable` row with period-aware provenance.

## Per-Period Resolution Rule

Direct balance matching is resolved per period, not by choosing one row globally.

For each canonical balance row:

1. Collect all direct candidates.
2. For each period, choose the best candidate with a value in that period.
3. Build one canonical row from those period-specific winners.
4. Preserve the union of all consumed aliases in `source_concepts`, `source_row_keys`, and `source_fact_ids`.
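The four steps above can be sketched as follows; `Candidate` is an illustrative type, and "best candidate" is simplified here to first-with-value, whereas the real parser applies richer ranking rules.

```rust
use std::collections::{BTreeMap, HashSet};

struct Candidate {
    concept: &'static str,
    // period label -> reported value
    values: BTreeMap<&'static str, f64>,
}

/// Build one canonical row by picking, per period, a candidate with a
/// value, and record the union of consumed concepts as provenance.
fn resolve_per_period(
    candidates: &[Candidate],
) -> (BTreeMap<&'static str, f64>, HashSet<&'static str>) {
    let mut values = BTreeMap::new();
    let mut source_concepts = HashSet::new();
    let periods: HashSet<_> = candidates
        .iter()
        .flat_map(|c| c.values.keys().copied())
        .collect();
    for period in periods {
        // "Best" is simplified to first-with-value in this sketch.
        if let Some(winner) = candidates.iter().find(|c| c.values.contains_key(period)) {
            values.insert(period, winner.values[period]);
            source_concepts.insert(winner.concept);
        }
    }
    (values, source_concepts)
}

fn main() {
    let a = Candidate {
        concept: "AccountsReceivableNetCurrent",
        values: BTreeMap::from([("FY2023", 120.0)]),
    };
    let b = Candidate {
        concept: "ReceivablesNetCurrent",
        values: BTreeMap::from([("FY2024", 135.0)]),
    };
    // Period-sparse aliases merge into one row with union provenance.
    let (values, sources) = resolve_per_period(&[a, b]);
    assert_eq!(values.len(), 2);
    assert!(sources.contains("AccountsReceivableNetCurrent"));
    assert!(sources.contains("ReceivablesNetCurrent"));
}
```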
## Formula Evaluation Rule

Structured formulas are evaluated only after their source surface rows have been resolved.

Supported operators:

- `sum`
- `subtract`

Formula rules:

- formulas operate period by period
- `sum` may treat nulls as zero when `treat_null_as_zero` is true
- `subtract` requires exactly two sources
- formula rows inherit provenance from the source surface rows they consume
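A hedged sketch of the operator semantics above, using illustrative types rather than the crate's actual formula engine:

```rust
use std::collections::BTreeMap;

// period label -> value (illustrative row shape)
type Row = BTreeMap<&'static str, f64>;

enum Op {
    Sum { treat_null_as_zero: bool },
    Subtract,
}

/// Evaluate a formula period by period over already-resolved rows.
fn eval(op: &Op, sources: &[&Row], periods: &[&'static str]) -> Row {
    let mut out = Row::new();
    for &p in periods {
        match op {
            Op::Sum { treat_null_as_zero } => {
                let vals: Vec<Option<f64>> =
                    sources.iter().map(|r| r.get(p).copied()).collect();
                // Nulls only become zero when the flag allows it.
                if *treat_null_as_zero || vals.iter().all(Option::is_some) {
                    out.insert(p, vals.into_iter().flatten().sum());
                }
            }
            Op::Subtract => {
                // `subtract` requires exactly two sources.
                assert_eq!(sources.len(), 2);
                if let (Some(a), Some(b)) = (sources[0].get(p), sources[1].get(p)) {
                    out.insert(p, a - b);
                }
            }
        }
    }
    out
}

fn main() {
    let cash = Row::from([("FY2024", 50.0)]);
    let short_term = Row::from([("FY2024", 20.0)]);
    let total = eval(
        &Op::Sum { treat_null_as_zero: true },
        &[&cash, &short_term],
        &["FY2024"],
    );
    assert_eq!(total["FY2024"], 70.0);
}
```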
## Residual Pruning Rule

`balance.unmapped` is a strict remainder set.

A balance statement row must be excluded from `unmapped` when either of these is true:

- its row key was consumed by a canonical balance row
- its concept key was consumed by a canonical balance row
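The remainder-set rule can be sketched with set-membership checks; the struct and concept names are illustrative.

```rust
use std::collections::HashSet;

struct StatementRow {
    row_key: &'static str,
    concept: &'static str,
}

/// `unmapped` is the strict remainder: drop any row whose row key OR
/// concept key was consumed by a canonical balance row.
fn prune(
    rows: Vec<StatementRow>,
    consumed_keys: &HashSet<&str>,
    consumed_concepts: &HashSet<&str>,
) -> Vec<StatementRow> {
    rows.into_iter()
        .filter(|r| {
            !consumed_keys.contains(r.row_key) && !consumed_concepts.contains(r.concept)
        })
        .collect()
}

fn main() {
    let rows = vec![
        StatementRow { row_key: "r1", concept: "ReceivablesNetCurrent" },
        StatementRow { row_key: "r2", concept: "SomeUnrelatedConcept" },
    ];
    let consumed_keys: HashSet<&str> = HashSet::new();
    let consumed_concepts: HashSet<&str> =
        ["ReceivablesNetCurrent"].into_iter().collect();
    // Only the truly unrelated row survives as residual.
    let unmapped = prune(rows, &consumed_keys, &consumed_concepts);
    assert_eq!(unmapped.len(), 1);
    assert_eq!(unmapped[0].row_key, "r2");
}
```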
## Helper Surface Rule

Some balance rows are parser helpers rather than user-facing canonical output.

Current helper rows:

- `deferred_revenue_current`
- `deferred_revenue_noncurrent`
- `current_liabilities`
- `leases`

Behavior:

- they remain available to formulas
- they do not appear in emitted `surface_rows`
- they do not create emitted detail buckets
- they still consume matched backend sources so those rows do not leak into `unmapped`

## Synonym vs Aggregate Child Rule

Two cases must remain distinct.

### Synonym aliases

Different concept names for the same canonical balance meaning.

Behavior:

- flatten into one canonical surface row
- do not emit duplicate detail rows
- do not remain in `unmapped`

### Aggregate child components

Rows that legitimately roll into a subtotal or total.

Behavior:

- may remain as detail rows beneath the canonical parent when grouping is enabled
- must not remain in `unmapped` after being consumed

## Sector Placement Decisions

Sector rows stay inside the same economic taxonomy.

Mappings in this pass:

- `loans` -> `noncurrent_assets`
- `allowance_for_credit_losses` -> `noncurrent_assets`
- `deposits` -> `current_liabilities`
- `policy_liabilities` -> `noncurrent_liabilities`
- `deferred_acquisition_costs` -> `noncurrent_assets`
- `investment_property` -> `noncurrent_assets`

`sector_specific` remains unused by default in this pass.

## Required Invariants

- A consumed balance source must never remain in `balance.unmapped`.
- A synonym alias must never create more than one canonical balance row.
- Hidden helper surfaces may consume sources but must not appear in emitted `surface_rows`.
- Formula-derived rows inherit canonical provenance from their source surfaces.
- The frontend response shape remains unchanged.

## Test Matrix

The parser must cover:

- direct alias flattening for `accounts_receivable`
- period-sparse alias merges into one canonical row
- formula derivation for `total_cash_and_equivalents`
- formula derivation for `unearned_revenue`
- formula derivation for `total_debt`
- formula derivation for `net_cash_position`
- helper rows staying out of emitted balance surfaces
- residual pruning of canonically consumed balance rows
- sector packs receiving merged core balance coverage without changing frontend contracts

## Learnings Reusable For Other Statements

The same parser rules should later apply to cash flow:

- canonical mapping outranks residual classification
- direct aliases should resolve per period
- helper rows can exist backend-only when formulas need them
- consumed sources must be removed from `unmapped`
- sector packs should inherit common canonical coverage rather than duplicating it
rust/fiscal-xbrl-core/CASH_FLOW_STATEMENT_PARSER_SPEC.md (new file, 155 lines)
@@ -0,0 +1,155 @@
# Cash Flow Statement Parser Spec

## Purpose

This document defines the backend-only cash-flow parsing rules for `fiscal-xbrl-core`.

This pass is limited to Rust parser behavior, taxonomy packs, and backend comparison tooling. It must not modify frontend files, frontend rendering logic, or frontend response shapes.

## Hydration Order

1. Load the selected surface pack.
2. For non-core packs, merge in any core balance-sheet and cash-flow surfaces that the selected pack does not override.
3. Resolve direct canonical cash-flow rows from statement rows.
4. Resolve aggregate-child cash-flow rows from matched detail components when direct canonical rows are absent.
5. Resolve formula-backed cash-flow rows from already-resolved canonical rows and helper rows.
6. Emit `unmapped` only for rows not consumed by canonical cash-flow parsing.

## Category Model

Cash-flow rows use these backend category keys:

- `operating`
- `investing`
- `financing`
- `free_cash_flow`
- `helper`

Rules:

- `helper` rows are backend-only and use `include_in_output: false`.
- Only `operating`, `investing`, `financing`, and `free_cash_flow` should appear in emitted `surface_rows`.
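A sketch of how the emission filter might look, assuming a `SurfaceRow` struct carrying the category and the `include_in_output` flag described above (both names are assumptions for illustration):

```rust
#[derive(Clone, Copy, PartialEq)]
enum Category {
    Operating,
    Investing,
    Financing,
    FreeCashFlow,
    Helper,
}

struct SurfaceRow {
    key: &'static str,
    category: Category,
    include_in_output: bool,
}

/// Only non-helper rows flagged for output reach emitted `surface_rows`.
fn emitted(rows: &[SurfaceRow]) -> Vec<&'static str> {
    rows.iter()
        .filter(|r| r.include_in_output && r.category != Category::Helper)
        .map(|r| r.key)
        .collect()
}

fn main() {
    let rows = [
        SurfaceRow {
            key: "operating_cash_flow",
            category: Category::Operating,
            include_in_output: true,
        },
        SurfaceRow {
            key: "contract_liability_incurred",
            category: Category::Helper,
            include_in_output: false,
        },
    ];
    // The helper row stays backend-only.
    assert_eq!(emitted(&rows), vec!["operating_cash_flow"]);
}
```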
## Canonical Precedence Rule

Canonical cash-flow mappings take precedence over residual classification.

If a statement row is consumed by a canonical cash-flow row, it must not remain in `detail_rows["cash_flow"]["unmapped"]`.

## Alias Flattening Rule

Synonymous cash-flow concepts flatten into one canonical surface row.

Example:

- `NetCashProvidedByUsedInOperatingActivities`
- `NetCashProvidedByUsedInOperatingActivitiesContinuingOperations`

These must become one `operating_cash_flow` row with period-aware provenance.

## Per-Period Resolution Rule

Direct cash-flow matching is resolved per period, not by choosing one row globally.

For each canonical cash-flow row:

1. Collect all direct candidates.
2. For each period, choose the best candidate with a value in that period.
3. Build one canonical row from those period-specific winners.
4. Preserve the union of all consumed aliases in `source_concepts`, `source_row_keys`, and `source_fact_ids`.

## Sign Normalization Rule

Some canonical cash-flow rows require sign normalization.

Supported transform:

- `invert`

Rule:

- sign transforms are applied after direct or aggregate resolution
- sign transforms are applied before formula evaluation consumes the row
- emitted detail rows inherit the same transform when they belong to the transformed canonical row
- provenance is preserved unchanged
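A minimal sketch of the `invert` transform and its ordering relative to formula evaluation; the row type is illustrative, and provenance (not modeled here) would be left untouched.

```rust
use std::collections::BTreeMap;

// period label -> value (illustrative row shape)
type Row = BTreeMap<&'static str, f64>;

/// Apply the `invert` transform after resolution, before formulas run.
fn invert(row: &mut Row) {
    for value in row.values_mut() {
        *value = -*value;
    }
}

fn main() {
    // Capex is typically reported as a positive outflow by filers.
    let mut capex = Row::from([("FY2024", 410.0)]);
    invert(&mut capex);
    assert_eq!(capex["FY2024"], -410.0);

    // Inversion happens before the formula consumes the row, so
    // free_cash_flow = operating_cash_flow + capital_expenditures works as a sum.
    let operating_cash_flow = 1200.0;
    assert_eq!(operating_cash_flow + capex["FY2024"], 790.0);
}
```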
## Formula Rule

Structured formulas are evaluated only after their source surface rows have been resolved.

Supported operators:

- `sum`
- `subtract`

Current formulas:

- `changes_unearned_revenue = contract_liability_incurred - contract_liability_recognized`
- `changes_other_operating_activities = changes_other_current_assets + changes_other_current_liabilities + changes_other_noncurrent_assets + changes_other_noncurrent_liabilities`
- `free_cash_flow = operating_cash_flow + capital_expenditures`
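The first formula above can be sketched period by period over the two helper rows; as a simplification of the engine's null handling, only periods where both sources resolved produce a value.

```rust
use std::collections::BTreeMap;

// period label -> value (illustrative row shape)
type Row = BTreeMap<&'static str, f64>;

/// changes_unearned_revenue = contract_liability_incurred
///                          - contract_liability_recognized,
/// evaluated period by period.
fn changes_unearned_revenue(incurred: &Row, recognized: &Row) -> Row {
    incurred
        .iter()
        .filter_map(|(&p, &inc)| recognized.get(p).map(|rec| (p, inc - rec)))
        .collect()
}

fn main() {
    let incurred = Row::from([("FY2023", 90.0), ("FY2024", 100.0)]);
    let recognized = Row::from([("FY2024", 80.0)]);
    let delta = changes_unearned_revenue(&incurred, &recognized);
    // Only FY2024 has both sources, so only FY2024 resolves.
    assert_eq!(delta.len(), 1);
    assert_eq!(delta["FY2024"], 20.0);
}
```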
## Helper Row Rule

Helper rows exist only to support formulas and canonical grouping.

Current helper rows:

- `contract_liability_incurred`
- `contract_liability_recognized`
- `changes_other_current_assets`
- `changes_other_current_liabilities`
- `changes_other_noncurrent_assets`
- `changes_other_noncurrent_liabilities`

Behavior:

- helper rows remain available for formula evaluation
- helper rows do not appear in emitted `surface_rows`
- helper rows do not create emitted detail buckets
- helper rows still consume matched backend sources so those rows do not leak into `unmapped`

## Residual Pruning Rule

`cash_flow.unmapped` is a strict remainder set.

A cash-flow statement row must be excluded from `unmapped` when either of these is true:

- its row key was consumed by a canonical cash-flow row
- its concept key was consumed by a canonical cash-flow row

## Sector Inheritance Rule

Sector packs inherit the core cash-flow taxonomy unless they provide an explicit cash-flow override.

Current behavior:

- bank/lender inherits core cash-flow rows
- broker/asset manager inherits core cash-flow rows
- insurance inherits core cash-flow rows
- REIT/real estate inherits core cash-flow rows

No first-pass sector-specific cash-flow overrides are required.

## Synonym vs Aggregate Child Rule

Two cases must remain distinct.

### Synonym aliases

Different concept names for the same canonical cash-flow meaning.

Behavior:

- flatten into one canonical surface row
- do not emit duplicate detail rows
- do not remain in `unmapped`

### Aggregate child components

Rows that legitimately roll into a subtotal or grouped adjustment row.

Behavior:

- may remain as detail rows beneath the canonical parent when grouping is enabled
- must not remain in `unmapped` after being consumed

## Required Invariants

- A consumed cash-flow source must never remain in `cash_flow.unmapped`.
- A synonym alias must never create more than one canonical cash-flow row.
- Hidden helper surfaces may consume sources but must not appear in emitted `surface_rows`.
- Formula-derived rows inherit canonical provenance from their source surfaces.
- The frontend response shape remains unchanged.

## Test Matrix

The parser must cover:

- direct sign inversion for `capital_expenditures`
- direct sign inversion for `debt_repaid`
- direct sign inversion for `share_repurchases`
- direct mapping for `operating_cash_flow`
- formula derivation for `changes_unearned_revenue`
- formula derivation for `changes_other_operating_activities`
- formula derivation for `free_cash_flow`
- helper rows staying out of emitted cash-flow surfaces
- residual pruning of canonically consumed cash-flow rows
- sector packs receiving merged core cash-flow coverage without changing frontend contracts
- fallback classification for fact-only cash-flow concepts such as `IncreaseDecreaseInAccountsReceivable` and `PaymentsOfDividends`

## Learnings Reusable For Other Statements

The same parser rules now apply consistently across income, balance, and cash flow:

- canonical mapping outranks residual classification
- direct aliases resolve per period
- helper rows may exist backend-only when formulas need them
- consumed sources must be removed from `unmapped`
- sector packs inherit common canonical coverage instead of duplicating it
rust/fiscal-xbrl-core/OPERATING_STATEMENT_PARSER_SPEC.md (new file, 103 lines)
@@ -0,0 +1,103 @@
# Operating Statement Parser Spec

## Purpose

This document defines the backend-only parsing rules for operating statement hydration in `fiscal-xbrl-core`.

This pass is intentionally limited to Rust parser behavior. It must not change frontend files, frontend rendering logic, or API response shapes.

## Hydration Order

1. Generic compact surface mapping builds initial `surface_rows`, `detail_rows`, and `unmapped` residuals.
2. Universal income parsing rewrites the income statement into canonical operating-statement rows.
3. Canonical income parsing is authoritative for income provenance and must prune any consumed residual rows from `detail_rows["income"]["unmapped"]`.

## Canonical Precedence Rule

For income rows, canonical universal mappings take precedence over generic residual classification.

If an income concept is consumed by a canonical operating-statement row, it must not remain in `unmapped`.

## Alias Flattening Rule

Multiple source aliases for the same canonical operating-statement concept must flatten into a single canonical surface row.

Examples:

- `us-gaap:OtherOperatingExpense`
- `us-gaap:OtherOperatingExpenses`
- `us-gaap:OtherCostAndExpenseOperating`

These may differ by filer or period, but they still represent one canonical row such as `other_operating_expense`.

## Per-Period Resolution Rule

Direct canonical matching is resolved per period, not by selecting one global winner for all periods.

For each canonical income row:

1. Collect all direct statement-row matches.
2. For each period, keep only candidates with a value in that period.
3. Choose the best candidate for that period using existing ranking rules.
4. Build one canonical row whose `values` and `resolved_source_row_keys` are assembled period by period.

The canonical row's provenance is the union of all consumed aliases, even if a different alias wins in different periods.

## Residual Pruning Rule

After canonical income rows are resolved:

- collect all consumed source row keys
- collect all consumed concept keys
- remove any residual income detail row from `unmapped` if either identifier matches

`unmapped` is a strict remainder set after income canonicalization.

## Synonym vs Aggregate Child Rule

Two cases must remain distinct:

### Synonym aliases

Different concept names representing the same canonical meaning.

Behavior:

- flatten into one canonical surface row
- do not emit as detail rows
- do not leave in `unmapped`

### Aggregate child components

Rows that are true components of a higher-level canonical row, such as:

- `SalesAndMarketingExpense`
- `GeneralAndAdministrativeExpense`

used to derive `selling_general_and_administrative`.

Behavior:

- may appear as detail rows under the canonical parent
- must not also remain in `unmapped` once consumed by that canonical parent

## Required Invariants

For income parsing, a consumed source may appear in exactly one of these places:

- canonical surface provenance
- canonical detail provenance
- `unmapped`

It must never appear in more than one place at the same time.

Additional invariants:

- canonical surface rows are unique by canonical key
- aliases are flattened into one canonical row
- `resolved_source_row_keys` are period-specific
- normalization counts reflect the post-pruning state
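The exactly-one-place rule can be checked as pairwise disjointness over the three provenance sets; this is an illustrative sketch with example concepts, not the crate's actual validation code.

```rust
use std::collections::HashSet;

/// A consumed source key may appear in surface provenance, detail
/// provenance, or `unmapped`, but never in more than one of them.
fn exclusive(
    surface: &HashSet<&str>,
    detail: &HashSet<&str>,
    unmapped: &HashSet<&str>,
) -> bool {
    surface.is_disjoint(detail)
        && surface.is_disjoint(unmapped)
        && detail.is_disjoint(unmapped)
}

fn main() {
    let surface: HashSet<&str> =
        ["us-gaap:OtherOperatingExpenses"].into_iter().collect();
    let detail: HashSet<&str> =
        ["us-gaap:SalesAndMarketingExpense"].into_iter().collect();
    let unmapped: HashSet<&str> =
        ["us-gaap:SomeResidualConcept"].into_iter().collect();
    assert!(exclusive(&surface, &detail, &unmapped));

    // A key appearing in two places violates the invariant.
    let leaked = surface.clone();
    assert!(!exclusive(&surface, &detail, &leaked));
}
```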
## Performance Constraints

- Use `HashSet` membership for consumed-source pruning.
- Build candidate collections once per canonical definition.
- Avoid UI-side dedupe or post-processing.
- Keep the parser close to linear in candidate volume per definition.

## Test Matrix

The parser must cover:

- direct alias dedupe for `other_operating_expense`
- period-sparse alias merge into a single canonical row
- pruning of canonically consumed aliases from `income.unmapped`
- preservation of truly unrelated residual rows
- pruning of formula-consumed component rows from `income.unmapped`

## Learnings For Other Statements

The same backend rules should later be applied to balance sheet and cash flow:

- canonical mapping must outrank residual classification
- alias resolution should be per-period
- consumed sources must be removed from `unmapped`
- synonym aliases and aggregate child components must be treated differently

When balance sheet and cash flow are upgraded, they should adopt these invariants without changing frontend response shapes.
@@ -37,10 +37,12 @@ static IDENTIFIER_RE: Lazy<Regex> = Lazy::new(|| {
|
|||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?identifier\b[^>]*\bscheme=["']([^"']+)["'][^>]*>(.*?)</(?:[a-z0-9_\-]+:)?identifier>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?identifier\b[^>]*\bscheme=["']([^"']+)["'][^>]*>(.*?)</(?:[a-z0-9_\-]+:)?identifier>"#).unwrap()
|
||||||
});
|
});
|
||||||
static SEGMENT_RE: Lazy<Regex> = Lazy::new(|| {
|
static SEGMENT_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?segment\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?segment>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?segment\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?segment>"#)
|
||||||
|
.unwrap()
|
||||||
});
|
});
|
||||||
static SCENARIO_RE: Lazy<Regex> = Lazy::new(|| {
|
static SCENARIO_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?scenario\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?scenario>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?scenario\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?scenario>"#)
|
||||||
|
.unwrap()
|
||||||
});
|
});
|
||||||
static START_DATE_RE: Lazy<Regex> = Lazy::new(|| {
|
static START_DATE_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?startDate>(.*?)</(?:[a-z0-9_\-]+:)?startDate>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?startDate>(.*?)</(?:[a-z0-9_\-]+:)?startDate>"#).unwrap()
|
||||||
@@ -55,7 +57,8 @@ static MEASURE_RE: Lazy<Regex> = Lazy::new(|| {
|
|||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?measure>(.*?)</(?:[a-z0-9_\-]+:)?measure>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?measure>(.*?)</(?:[a-z0-9_\-]+:)?measure>"#).unwrap()
|
||||||
});
|
});
|
||||||
static LABEL_LINK_RE: Lazy<Regex> = Lazy::new(|| {
|
static LABEL_LINK_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelLink\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?labelLink>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelLink\b[^>]*>(.*?)</(?:[a-z0-9_\-]+:)?labelLink>"#)
|
||||||
|
.unwrap()
|
||||||
});
|
});
|
||||||
static PRESENTATION_LINK_RE: Lazy<Regex> = Lazy::new(|| {
|
static PRESENTATION_LINK_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationLink\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?presentationLink>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationLink\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?presentationLink>"#).unwrap()
|
||||||
@@ -67,12 +70,14 @@ static LABEL_RESOURCE_RE: Lazy<Regex> = Lazy::new(|| {
|
|||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?label\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?label>"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?label\b([^>]*)>(.*?)</(?:[a-z0-9_\-]+:)?label>"#).unwrap()
|
||||||
});
|
});
|
||||||
static LABEL_ARC_RE: Lazy<Regex> = Lazy::new(|| {
|
static LABEL_ARC_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?labelArc>)?"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?labelArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?labelArc>)?"#)
|
||||||
|
.unwrap()
|
||||||
});
|
});
|
||||||
static PRESENTATION_ARC_RE: Lazy<Regex> = Lazy::new(|| {
|
static PRESENTATION_ARC_RE: Lazy<Regex> = Lazy::new(|| {
|
||||||
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?presentationArc>)?"#).unwrap()
|
Regex::new(r#"(?is)<(?:[a-z0-9_\-]+:)?presentationArc\b([^>]*)/?>(?:</(?:[a-z0-9_\-]+:)?presentationArc>)?"#).unwrap()
|
||||||
});
|
});
|
||||||
static ATTR_RE: Lazy<Regex> = Lazy::new(|| Regex::new(r#"([a-zA-Z0-9:_\-]+)=["']([^"']+)["']"#).unwrap());
|
static ATTR_RE: Lazy<Regex> =
|
||||||
|
Lazy::new(|| Regex::new(r#"([a-zA-Z0-9:_\-]+)=["']([^"']+)["']"#).unwrap());
|
||||||
|
|
||||||
#[derive(Debug, Deserialize)]
|
#[derive(Debug, Deserialize)]
|
||||||
#[serde(rename_all = "camelCase")]
|
#[serde(rename_all = "camelCase")]
|
||||||
@@ -451,7 +456,8 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
|
|||||||
});
|
});
|
||||||
};
|
};
|
||||||
|
|
||||||
let instance_text = fetch_text(&client, &instance_asset.url).context("fetch request failed for XBRL instance")?;
|
let instance_text = fetch_text(&client, &instance_asset.url)
|
||||||
|
.context("fetch request failed for XBRL instance")?;
|
||||||
let parsed_instance = parse_xbrl_instance(&instance_text, Some(instance_asset.name.clone()));
|
let parsed_instance = parse_xbrl_instance(&instance_text, Some(instance_asset.name.clone()));
|
||||||
|
|
||||||
let mut label_by_concept = HashMap::new();
|
let mut label_by_concept = HashMap::new();
|
||||||
@@ -459,11 +465,9 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
|
|||||||
let mut source = "xbrl_instance".to_string();
|
let mut source = "xbrl_instance".to_string();
|
||||||
let mut parse_error = None;
|
let mut parse_error = None;
|
||||||
|
|
||||||
for asset in discovered
|
for asset in discovered.assets.iter().filter(|asset| {
|
||||||
.assets
|
asset.is_selected && (asset.asset_type == "presentation" || asset.asset_type == "label")
|
||||||
.iter()
|
}) {
|
||||||
.filter(|asset| asset.is_selected && (asset.asset_type == "presentation" || asset.asset_type == "label"))
|
|
||||||
{
|
|
||||||
match fetch_text(&client, &asset.url) {
|
match fetch_text(&client, &asset.url) {
|
||||||
Ok(content) => {
|
Ok(content) => {
|
||||||
if asset.asset_type == "presentation" {
|
if asset.asset_type == "presentation" {
|
||||||
@@ -515,10 +519,15 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
|
|||||||
pack_selection.pack,
|
pack_selection.pack,
|
||||||
&mut compact_model,
|
&mut compact_model,
|
||||||
)?;
|
)?;
|
||||||
let kpi_result = kpi_mapper::build_taxonomy_kpis(&materialized.periods, &facts, pack_selection.pack)?;
|
let kpi_result =
|
||||||
|
kpi_mapper::build_taxonomy_kpis(&materialized.periods, &facts, pack_selection.pack)?;
|
||||||
compact_model.normalization_summary.kpi_row_count = kpi_result.rows.len();
|
compact_model.normalization_summary.kpi_row_count = kpi_result.rows.len();
|
||||||
for warning in kpi_result.warnings {
|
for warning in kpi_result.warnings {
|
||||||
if !compact_model.normalization_summary.warnings.contains(&warning) {
|
if !compact_model
|
||||||
|
.normalization_summary
|
||||||
|
.warnings
|
||||||
|
.contains(&warning)
|
||||||
|
{
|
||||||
compact_model.normalization_summary.warnings.push(warning);
|
compact_model.normalization_summary.warnings.push(warning);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -526,7 +535,11 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
|
|||||||
&mut compact_model.concept_mappings,
|
&mut compact_model.concept_mappings,
|
||||||
kpi_result.mapping_assignments,
|
kpi_result.mapping_assignments,
|
||||||
);
|
);
|
||||||
surface_mapper::apply_mapping_assignments(&mut concepts, &mut facts, &compact_model.concept_mappings);
|
surface_mapper::apply_mapping_assignments(
|
||||||
|
&mut concepts,
|
||||||
|
&mut facts,
|
||||||
|
&compact_model.concept_mappings,
|
||||||
|
);
|
||||||
|
|
||||||
let has_rows = materialized
|
let has_rows = materialized
|
||||||
.statement_rows
|
.statement_rows
|
||||||
@@ -572,7 +585,11 @@ pub fn hydrate_filing(input: HydrateFilingRequest) -> Result<HydrateFilingRespon
|
|||||||
concepts_count: concepts.len(),
|
concepts_count: concepts.len(),
|
||||||
dimensions_count: facts
|
dimensions_count: facts
|
||||||
.iter()
|
.iter()
|
||||||
.flat_map(|fact| fact.dimensions.iter().map(|dimension| format!("{}::{}", dimension.axis, dimension.member)))
|
.flat_map(|fact| {
|
||||||
|
fact.dimensions
|
||||||
|
.iter()
|
||||||
|
.map(|dimension| format!("{}::{}", dimension.axis, dimension.member))
|
||||||
|
})
|
||||||
.collect::<HashSet<_>>()
|
.collect::<HashSet<_>>()
|
||||||
.len(),
|
.len(),
|
||||||
assets: discovered.assets,
|
assets: discovered.assets,
|
||||||
@@ -622,7 +639,10 @@ struct DiscoveredAssets {
|
|||||||
assets: Vec<AssetOutput>,
|
assets: Vec<AssetOutput>,
|
||||||
}
|
}
|
||||||
|
|
||||||
fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Result<DiscoveredAssets> {
|
fn discover_filing_assets(
|
||||||
|
input: &HydrateFilingRequest,
|
||||||
|
client: &Client,
|
||||||
|
) -> Result<DiscoveredAssets> {
|
||||||
let Some(directory_url) = resolve_filing_directory_url(
|
let Some(directory_url) = resolve_filing_directory_url(
|
||||||
input.filing_url.as_deref(),
|
input.filing_url.as_deref(),
|
||||||
&input.cik,
|
&input.cik,
|
||||||
@@ -631,12 +651,19 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
|
|||||||
return Ok(DiscoveredAssets { assets: vec![] });
|
return Ok(DiscoveredAssets { assets: vec![] });
|
||||||
};
|
};
|
||||||
|
|
||||||
let payload = fetch_json::<FilingDirectoryPayload>(client, &format!("{directory_url}index.json")).ok();
|
let payload =
|
||||||
|
fetch_json::<FilingDirectoryPayload>(client, &format!("{directory_url}index.json")).ok();
|
||||||
let mut discovered = Vec::new();
|
let mut discovered = Vec::new();
|
||||||
|
|
||||||
if let Some(items) = payload.and_then(|payload| payload.directory.and_then(|directory| directory.item)) {
|
if let Some(items) =
|
||||||
|
payload.and_then(|payload| payload.directory.and_then(|directory| directory.item))
|
||||||
|
{
|
||||||
for item in items {
|
for item in items {
|
||||||
let Some(name) = item.name.map(|name| name.trim().to_string()).filter(|name| !name.is_empty()) else {
|
let Some(name) = item
|
||||||
|
.name
|
||||||
|
.map(|name| name.trim().to_string())
|
||||||
|
.filter(|name| !name.is_empty())
|
||||||
|
else {
|
||||||
continue;
|
continue;
|
||||||
};
|
};
|
||||||
|
|
||||||
@@ -683,12 +710,19 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
|
|||||||
score_instance(&asset.name, input.primary_document.as_deref()),
|
score_instance(&asset.name, input.primary_document.as_deref()),
|
||||||
)
|
)
|
||||||
})
|
})
|
||||||
.max_by(|left, right| left.1.partial_cmp(&right.1).unwrap_or(std::cmp::Ordering::Equal))
|
.max_by(|left, right| {
|
||||||
|
left.1
|
||||||
|
.partial_cmp(&right.1)
|
||||||
|
.unwrap_or(std::cmp::Ordering::Equal)
|
||||||
|
})
|
||||||
.map(|entry| entry.0);
|
.map(|entry| entry.0);
|
||||||
|
|
||||||
for asset in &mut discovered {
|
for asset in &mut discovered {
|
||||||
asset.score = if asset.asset_type == "instance" {
|
asset.score = if asset.asset_type == "instance" {
|
||||||
Some(score_instance(&asset.name, input.primary_document.as_deref()))
|
Some(score_instance(
|
||||||
|
&asset.name,
|
||||||
|
input.primary_document.as_deref(),
|
||||||
|
))
|
||||||
} else if asset.asset_type == "pdf" {
|
} else if asset.asset_type == "pdf" {
|
||||||
Some(score_pdf(&asset.name, asset.size_bytes))
|
Some(score_pdf(&asset.name, asset.size_bytes))
|
||||||
} else {
|
} else {
|
||||||
@@ -708,7 +742,11 @@ fn discover_filing_assets(input: &HydrateFilingRequest, client: &Client) -> Resu
    Ok(DiscoveredAssets { assets: discovered })
}

fn resolve_filing_directory_url(
    filing_url: Option<&str>,
    cik: &str,
    accession_number: &str,
) -> Option<String> {
    if let Some(filing_url) = filing_url.map(str::trim).filter(|value| !value.is_empty()) {
        if let Some(last_slash) = filing_url.rfind('/') {
            if last_slash > "https://".len() {
@@ -725,7 +763,10 @@ fn resolve_filing_directory_url(filing_url: Option<&str>, cik: &str, accession_n
}

fn normalize_cik_for_path(value: &str) -> Option<String> {
    let digits = value
        .chars()
        .filter(|char| char.is_ascii_digit())
        .collect::<String>();
    if digits.is_empty() {
        return None;
    }
@@ -741,16 +782,25 @@ fn classify_asset_type(name: &str) -> &'static str {
        return "schema";
    }
    if lower.ends_with(".xml") {
        if lower.ends_with("_pre.xml")
            || lower.ends_with("-pre.xml")
            || lower.contains("presentation")
        {
            return "presentation";
        }
        if lower.ends_with("_lab.xml") || lower.ends_with("-lab.xml") || lower.contains("label") {
            return "label";
        }
        if lower.ends_with("_cal.xml")
            || lower.ends_with("-cal.xml")
            || lower.contains("calculation")
        {
            return "calculation";
        }
        if lower.ends_with("_def.xml")
            || lower.ends_with("-def.xml")
            || lower.contains("definition")
        {
            return "definition";
        }
        return "instance";
@@ -779,7 +829,11 @@ fn score_instance(name: &str, primary_document: Option<&str>) -> f64 {
            score += 5.0;
        }
    }
    if lower.contains("cal")
        || lower.contains("def")
        || lower.contains("lab")
        || lower.contains("pre")
    {
        score -= 3.0;
    }
    score
@@ -819,7 +873,9 @@ fn fetch_text(client: &Client, url: &str) -> Result<String> {
    if !response.status().is_success() {
        return Err(anyhow!("request failed for {url} ({})", response.status()));
    }
    response
        .text()
        .with_context(|| format!("unable to read response body for {url}"))
}

fn fetch_json<T: for<'de> Deserialize<'de>>(client: &Client, url: &str) -> Result<T> {
@@ -847,17 +903,36 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
    let mut facts = Vec::new();

    for captures in FACT_RE.captures_iter(raw) {
        let prefix = captures
            .get(1)
            .map(|value| value.as_str().trim())
            .unwrap_or_default();
        let local_name = captures
            .get(2)
            .map(|value| value.as_str().trim())
            .unwrap_or_default();
        let attrs = captures
            .get(3)
            .map(|value| value.as_str())
            .unwrap_or_default();
        let body = decode_xml_entities(
            captures
                .get(4)
                .map(|value| value.as_str())
                .unwrap_or_default()
                .trim(),
        );

        if prefix.is_empty() || local_name.is_empty() || is_xbrl_infrastructure_prefix(prefix) {
            continue;
        }

        let attr_map = parse_attrs(attrs);
        let Some(context_id) = attr_map
            .get("contextRef")
            .cloned()
            .or_else(|| attr_map.get("contextref").cloned())
        else {
            continue;
        };

@@ -870,7 +945,10 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
            .cloned()
            .unwrap_or_else(|| format!("urn:unknown:{prefix}"));
        let context = context_by_id.get(&context_id);
        let unit_ref = attr_map
            .get("unitRef")
            .cloned()
            .or_else(|| attr_map.get("unitref").cloned());
        let unit = unit_ref
            .as_ref()
            .and_then(|unit_ref| unit_by_id.get(unit_ref))
@@ -896,8 +974,12 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
            period_start: context.and_then(|value| value.period_start.clone()),
            period_end: context.and_then(|value| value.period_end.clone()),
            period_instant: context.and_then(|value| value.period_instant.clone()),
            dimensions: context
                .map(|value| value.dimensions.clone())
                .unwrap_or_default(),
            is_dimensionless: context
                .map(|value| value.dimensions.is_empty())
                .unwrap_or(true),
            source_file: source_file.clone(),
        });
    }
@@ -916,10 +998,7 @@ fn parse_xbrl_instance(raw: &str, source_file: Option<String>) -> ParsedInstance
        })
        .collect::<Vec<_>>();

    ParsedInstance { contexts, facts }
}

fn parse_namespace_map(raw: &str, root_tag_hint: &str) -> HashMap<String, String> {
@@ -935,7 +1014,10 @@ fn parse_namespace_map(raw: &str, root_tag_hint: &str) -> HashMap<String, String
        .captures_iter(&root_start)
    {
        if let (Some(prefix), Some(uri)) = (captures.get(1), captures.get(2)) {
            map.insert(
                prefix.as_str().trim().to_string(),
                uri.as_str().trim().to_string(),
            );
        }
    }

@@ -946,16 +1028,26 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {
    let mut contexts = HashMap::new();

    for captures in CONTEXT_RE.captures_iter(raw) {
        let Some(context_id) = captures
            .get(1)
            .map(|value| value.as_str().trim().to_string())
        else {
            continue;
        };
        let block = captures
            .get(2)
            .map(|value| value.as_str())
            .unwrap_or_default();
        let (entity_identifier, entity_scheme) = IDENTIFIER_RE
            .captures(block)
            .map(|captures| {
                (
                    captures
                        .get(2)
                        .map(|value| decode_xml_entities(value.as_str().trim())),
                    captures
                        .get(1)
                        .map(|value| decode_xml_entities(value.as_str().trim())),
                )
            })
            .unwrap_or((None, None));
@@ -984,7 +1076,10 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {

        let mut dimensions = Vec::new();
        if let Some(segment_value) = segment.as_ref() {
            if let Some(members) = segment_value
                .get("explicitMembers")
                .and_then(|value| value.as_array())
            {
                for member in members {
                    if let (Some(axis), Some(member_value)) = (
                        member.get("axis").and_then(|value| value.as_str()),
@@ -999,7 +1094,10 @@ fn parse_contexts(raw: &str) -> HashMap<String, ParsedContext> {
            }
        }
        if let Some(scenario_value) = scenario.as_ref() {
            if let Some(members) = scenario_value
                .get("explicitMembers")
                .and_then(|value| value.as_array())
            {
                for member in members {
                    if let (Some(axis), Some(member_value)) = (
                        member.get("axis").and_then(|value| value.as_str()),
@@ -1062,10 +1160,16 @@ fn parse_dimension_container(raw: &str) -> serde_json::Value {
fn parse_units(raw: &str) -> HashMap<String, ParsedUnit> {
    let mut units = HashMap::new();
    for captures in UNIT_RE.captures_iter(raw) {
        let Some(id) = captures
            .get(1)
            .map(|value| value.as_str().trim().to_string())
        else {
            continue;
        };
        let block = captures
            .get(2)
            .map(|value| value.as_str())
            .unwrap_or_default();
        let measures = MEASURE_RE
            .captures_iter(block)
            .filter_map(|captures| captures.get(1))
@@ -1097,7 +1201,10 @@ fn parse_attrs(raw: &str) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for captures in ATTR_RE.captures_iter(raw) {
        if let (Some(name), Some(value)) = (captures.get(1), captures.get(2)) {
            map.insert(
                name.as_str().to_string(),
                decode_xml_entities(value.as_str()),
            );
        }
    }
    map
@@ -1138,12 +1245,20 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
    let mut preferred = HashMap::<String, (String, i64)>::new();

    for captures in LABEL_LINK_RE.captures_iter(raw) {
        let block = captures
            .get(1)
            .map(|value| value.as_str())
            .unwrap_or_default();
        let mut loc_by_label = HashMap::<String, String>::new();
        let mut resource_by_label = HashMap::<String, (String, Option<String>)>::new();

        for captures in LOC_RE.captures_iter(block) {
            let attrs = parse_attrs(
                captures
                    .get(1)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            );
            let Some(label) = attrs.get("xlink:label").cloned() else {
                continue;
            };
@@ -1160,11 +1275,21 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
        }

        for captures in LABEL_RESOURCE_RE.captures_iter(block) {
            let attrs = parse_attrs(
                captures
                    .get(1)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            );
            let Some(label) = attrs.get("xlink:label").cloned() else {
                continue;
            };
            let body = decode_xml_entities(
                captures
                    .get(2)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            )
            .split_whitespace()
            .collect::<Vec<_>>()
            .join(" ");
@@ -1175,7 +1300,12 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
        }

        for captures in LABEL_ARC_RE.captures_iter(block) {
            let attrs = parse_attrs(
                captures
                    .get(1)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            );
            let Some(from) = attrs.get("xlink:from").cloned() else {
                continue;
            };
@@ -1190,7 +1320,11 @@ fn parse_label_linkbase(raw: &str) -> HashMap<String, String> {
            };
            let priority = label_priority(role.as_deref());
            let current = preferred.get(concept_key).cloned();
            if current
                .as_ref()
                .map(|(_, current_priority)| priority > *current_priority)
                .unwrap_or(true)
            {
                preferred.insert(concept_key.clone(), (label.clone(), priority));
            }
        }
@@ -1207,18 +1341,31 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
    let mut rows = Vec::new();

    for captures in PRESENTATION_LINK_RE.captures_iter(raw) {
        let link_attrs = parse_attrs(
            captures
                .get(1)
                .map(|value| value.as_str())
                .unwrap_or_default(),
        );
        let Some(role_uri) = link_attrs.get("xlink:role").cloned() else {
            continue;
        };
        let block = captures
            .get(2)
            .map(|value| value.as_str())
            .unwrap_or_default();
        let mut loc_by_label = HashMap::<String, (String, String, bool)>::new();
        let mut children_by_label = HashMap::<String, Vec<(String, f64)>>::new();
        let mut incoming = HashSet::<String>::new();
        let mut all_referenced = HashSet::<String>::new();

        for captures in LOC_RE.captures_iter(block) {
            let attrs = parse_attrs(
                captures
                    .get(1)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            );
            let Some(label) = attrs.get("xlink:label").cloned() else {
                continue;
            };
@@ -1228,14 +1375,27 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
            let Some(qname) = qname_from_href(&href) else {
                continue;
            };
            let Some((concept_key, qname, local_name)) = concept_from_qname(&qname, &namespaces)
            else {
                continue;
            };
            loc_by_label.insert(
                label,
                (
                    concept_key,
                    qname,
                    local_name.to_ascii_lowercase().contains("abstract"),
                ),
            );
        }

        for captures in PRESENTATION_ARC_RE.captures_iter(block) {
            let attrs = parse_attrs(
                captures
                    .get(1)
                    .map(|value| value.as_str())
                    .unwrap_or_default(),
            );
            let Some(from) = attrs.get("xlink:from").cloned() else {
                continue;
            };
@@ -1248,8 +1408,16 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
            let order = attrs
                .get("order")
                .and_then(|value| value.parse::<f64>().ok())
                .unwrap_or_else(|| {
                    children_by_label
                        .get(&from)
                        .map(|children| children.len() as f64 + 1.0)
                        .unwrap_or(1.0)
                });
            children_by_label
                .entry(from.clone())
                .or_default()
                .push((to.clone(), order));
            incoming.insert(to.clone());
            all_referenced.insert(from);
            all_referenced.insert(to);
@@ -1281,7 +1449,11 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
                return;
            }

            let parent_concept_key = parent_label.and_then(|parent| {
                loc_by_label
                    .get(parent)
                    .map(|(concept_key, _, _)| concept_key.clone())
            });
            rows.push(PresentationNode {
                concept_key: concept_key.clone(),
                role_uri: role_uri.to_string(),
@@ -1292,7 +1464,11 @@ fn parse_presentation_linkbase(raw: &str) -> Vec<PresentationNode> {
            });

            let mut children = children_by_label.get(label).cloned().unwrap_or_default();
            children.sort_by(|left, right| {
                left.1
                    .partial_cmp(&right.1)
                    .unwrap_or(std::cmp::Ordering::Equal)
            });
            for (index, (child_label, _)) in children.into_iter().enumerate() {
                dfs(
                    &child_label,
@@ -1400,7 +1576,10 @@ fn materialize_taxonomy_statements(
            .clone()
            .or_else(|| fact.period_instant.clone())
            .unwrap_or_else(|| filing_date.to_string());
        let id = format!(
            "{date}-{compact_accession}-{}",
            period_by_signature.len() + 1
        );
        let period_label = if fact.period_instant.is_some() && fact.period_start.is_none() {
            "Instant".to_string()
        } else if fact.period_start.is_some() && fact.period_end.is_some() {
@@ -1420,7 +1599,10 @@ fn materialize_taxonomy_statements(
                accession_number: accession_number.to_string(),
                filing_date: filing_date.to_string(),
                period_start: fact.period_start.clone(),
                period_end: fact
                    .period_end
                    .clone()
                    .or_else(|| fact.period_instant.clone()),
                filing_type: filing_type.to_string(),
                period_label,
            },
@@ -1429,9 +1611,17 @@ fn materialize_taxonomy_statements(

    let mut periods = period_by_signature.values().cloned().collect::<Vec<_>>();
    periods.sort_by(|left, right| {
        let left_key = left
            .period_end
            .clone()
            .unwrap_or_else(|| left.filing_date.clone());
        let right_key = right
            .period_end
            .clone()
            .unwrap_or_else(|| right.filing_date.clone());
        left_key
            .cmp(&right_key)
            .then_with(|| left.id.cmp(&right.id))
    });
    let period_id_by_signature = period_by_signature
        .iter()
@@ -1440,7 +1630,10 @@ fn materialize_taxonomy_statements(

    let mut presentation_by_concept = HashMap::<String, Vec<&PresentationNode>>::new();
    for node in presentation {
        presentation_by_concept
            .entry(node.concept_key.clone())
            .or_default()
            .push(node);
    }

    let mut grouped_by_statement = empty_parsed_fact_map();
@@ -1502,9 +1695,13 @@ fn materialize_taxonomy_statements(
    let mut concepts = Vec::<ConceptOutput>::new();

    for statement_kind in statement_keys() {
        let concept_groups = grouped_by_statement
            .remove(statement_kind)
            .unwrap_or_default();
        let mut concept_keys = HashSet::<String>::new();
        for node in presentation.iter().filter(|node| {
            classify_statement_role(&node.role_uri).as_deref() == Some(statement_kind)
        }) {
            concept_keys.insert(node.concept_key.clone());
        }
        for concept_key in concept_groups.keys() {
@@ -1516,12 +1713,21 @@ fn materialize_taxonomy_statements(
            .map(|concept_key| {
                let nodes = presentation
                    .iter()
                    .filter(|node| {
                        node.concept_key == concept_key
                            && classify_statement_role(&node.role_uri).as_deref()
                                == Some(statement_kind)
                    })
                    .collect::<Vec<_>>();
                let order = nodes
                    .iter()
                    .map(|node| node.order)
                    .fold(f64::INFINITY, f64::min);
                let depth = nodes.iter().map(|node| node.depth).min().unwrap_or(0);
                let role_uri = nodes.first().map(|node| node.role_uri.clone());
                let parent_concept_key = nodes
                    .first()
                    .and_then(|node| node.parent_concept_key.clone());
                (concept_key, order, depth, role_uri, parent_concept_key)
            })
            .collect::<Vec<_>>();
@@ -1532,8 +1738,13 @@ fn materialize_taxonomy_statements(
                .then_with(|| left.0.cmp(&right.0))
        });

        for (concept_key, presentation_order, depth, role_uri, parent_concept_key) in
            ordered_concepts
        {
            let fact_group = concept_groups
                .get(&concept_key)
                .cloned()
                .unwrap_or_default();
            let (namespace_uri, local_name) = split_concept_key(&concept_key);
            let qname = fact_group
                .first()
@@ -1672,7 +1883,13 @@ fn empty_detail_row_map() -> DetailRowStatementMap {
}

fn statement_keys() -> [&'static str; 5] {
    [
        "income",
        "balance",
        "cash_flow",
        "equity",
        "comprehensive_income",
    ]
}

fn statement_key_ref(value: &str) -> Option<&'static str> {
@@ -1709,7 +1926,13 @@ fn pick_preferred_fact(grouped_facts: &[(i64, ParsedFact)]) -> Option<&(i64, Par
                .unwrap_or_default();
            left_date.cmp(&right_date)
        })
        .then_with(|| {
            left.1
                .value
                .abs()
                .partial_cmp(&right.1.value.abs())
                .unwrap_or(std::cmp::Ordering::Equal)
        })
    })
}

@@ -1779,12 +2002,6 @@ fn classify_statement_role(role_uri: &str) -> Option<String> {

fn concept_statement_fallback(local_name: &str) -> Option<String> {
    let normalized = local_name.to_ascii_lowercase()
    if Regex::new(r#"equity|retainedearnings|additionalpaidincapital"#)
        .unwrap()
        .is_match(&normalized)
@@ -1794,6 +2011,22 @@ fn concept_statement_fallback(local_name: &str) -> Option<String> {
     if normalized.contains("comprehensiveincome") {
         return Some("comprehensive_income".to_string());
     }
+    if Regex::new(
+        r#"deferredpolicyacquisitioncosts(andvalueofbusinessacquired)?$|supplementaryinsuranceinformationdeferredpolicyacquisitioncosts$|deferredacquisitioncosts$"#,
+    )
+    .unwrap()
+    .is_match(&normalized)
+    {
+        return Some("balance".to_string());
+    }
+    if Regex::new(
+        r#"netcashprovidedbyusedin.*activities|increasedecreasein|paymentstoacquire|paymentsforcapitalimprovements$|paymentsfordepositsonrealestateacquisitions$|paymentsforrepurchase|paymentsofdividends|dividendscommonstockcash$|proceedsfrom|repaymentsofdebt|sharebasedcompensation$|allocatedsharebasedcompensationexpense$|depreciationdepletionandamortization$|depreciationamortizationandaccretionnet$|depreciationandamortization$|depreciationamortizationandother$|otheradjustmentstoreconcilenetincomelosstocashprovidedbyusedinoperatingactivities"#,
+    )
+    .unwrap()
+    .is_match(&normalized)
+    {
+        return Some("cash_flow".to_string());
+    }
     if Regex::new(
         r#"asset|liabilit|debt|financingreceivable|loansreceivable|deposits|allowanceforcreditloss|futurepolicybenefits|policyholderaccountbalances|unearnedpremiums|realestateinvestmentproperty|grossatcarryingvalue|investmentproperty"#,
     )
@@ -1967,7 +2200,10 @@ mod tests {
             vec![],
         )
         .expect("core pack should load and map");
-        let income_surface_rows = model.surface_rows.get("income").expect("income surface rows");
+        let income_surface_rows = model
+            .surface_rows
+            .get("income")
+            .expect("income surface rows");
         let op_expenses = income_surface_rows
             .iter()
             .find(|row| row.key == "operating_expenses")
@@ -1978,7 +2214,10 @@ mod tests {
             .expect("revenue surface row");
 
         assert_eq!(revenue.values.get("2025").copied().flatten(), Some(120.0));
-        assert_eq!(op_expenses.values.get("2024").copied().flatten(), Some(40.0));
+        assert_eq!(
+            op_expenses.values.get("2024").copied().flatten(),
+            Some(40.0)
+        );
         assert_eq!(op_expenses.detail_count, Some(2));
 
         let operating_expense_details = model
@@ -1987,8 +2226,12 @@ mod tests {
             .and_then(|groups| groups.get("operating_expenses"))
             .expect("operating expenses details");
         assert_eq!(operating_expense_details.len(), 2);
-        assert!(operating_expense_details.iter().any(|row| row.key == "sga-row"));
-        assert!(operating_expense_details.iter().any(|row| row.key == "rd-row"));
+        assert!(operating_expense_details
+            .iter()
+            .any(|row| row.key == "sga-row"));
+        assert!(operating_expense_details
+            .iter()
+            .any(|row| row.key == "rd-row"));
 
         let residual_rows = model
             .detail_rows
@@ -2003,17 +2246,26 @@ mod tests {
             .concept_mappings
             .get("http://fasb.org/us-gaap/2024#ResearchAndDevelopmentExpense")
             .expect("rd mapping");
-        assert_eq!(rd_mapping.detail_parent_surface_key.as_deref(), Some("operating_expenses"));
-        assert_eq!(rd_mapping.surface_key.as_deref(), Some("operating_expenses"));
+        assert_eq!(
+            rd_mapping.detail_parent_surface_key.as_deref(),
+            Some("operating_expenses")
+        );
+        assert_eq!(
+            rd_mapping.surface_key.as_deref(),
+            Some("operating_expenses")
+        );
 
         let residual_mapping = model
             .concept_mappings
             .get("urn:company#OtherOperatingCharges")
             .expect("residual mapping");
         assert!(residual_mapping.residual_flag);
-        assert_eq!(residual_mapping.detail_parent_surface_key.as_deref(), Some("unmapped"));
+        assert_eq!(
+            residual_mapping.detail_parent_surface_key.as_deref(),
+            Some("unmapped")
+        );
 
-        assert_eq!(model.normalization_summary.surface_row_count, 5);
+        assert_eq!(model.normalization_summary.surface_row_count, 6);
         assert_eq!(model.normalization_summary.detail_row_count, 3);
         assert_eq!(model.normalization_summary.unmapped_row_count, 1);
     }
@@ -2051,18 +2303,60 @@ mod tests {
     #[test]
     fn classifies_pack_specific_concepts_without_presentation_roles() {
         assert_eq!(
-            concept_statement_fallback("FinancingReceivableExcludingAccruedInterestAfterAllowanceForCreditLoss")
+            concept_statement_fallback(
+                "FinancingReceivableExcludingAccruedInterestAfterAllowanceForCreditLoss"
+            )
             .as_deref(),
             Some("balance")
         );
-        assert_eq!(concept_statement_fallback("Deposits").as_deref(), Some("balance"));
+        assert_eq!(
+            concept_statement_fallback("Deposits").as_deref(),
+            Some("balance")
+        );
         assert_eq!(
             concept_statement_fallback("RealEstateInvestmentPropertyNet").as_deref(),
             Some("balance")
         );
-        assert_eq!(concept_statement_fallback("LeaseIncome").as_deref(), Some("income"));
         assert_eq!(
-            concept_statement_fallback("DirectCostsOfLeasedAndRentedPropertyOrEquipment").as_deref(),
+            concept_statement_fallback("DeferredPolicyAcquisitionCosts").as_deref(),
+            Some("balance")
+        );
+        assert_eq!(
+            concept_statement_fallback("DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired")
+                .as_deref(),
+            Some("balance")
+        );
+        assert_eq!(
+            concept_statement_fallback("IncreaseDecreaseInAccountsReceivable").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("PaymentsOfDividends").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("RepaymentsOfDebt").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("ShareBasedCompensation").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("PaymentsForCapitalImprovements").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("PaymentsForDepositsOnRealEstateAcquisitions").as_deref(),
+            Some("cash_flow")
+        );
+        assert_eq!(
+            concept_statement_fallback("LeaseIncome").as_deref(),
+            Some("income")
+        );
+        assert_eq!(
+            concept_statement_fallback("DirectCostsOfLeasedAndRentedPropertyOrEquipment")
+                .as_deref(),
             Some("income")
         );
     }
File diff suppressed because it is too large
@@ -1,12 +1,22 @@
 use anyhow::{anyhow, Context, Result};
 use serde::Deserialize;
+use std::collections::HashMap;
 use std::env;
 use std::fs;
-use std::collections::HashMap;
 use std::path::PathBuf;
 
 use crate::pack_selector::FiscalPack;
 
+fn default_include_in_output() -> bool {
+    true
+}
+
+#[derive(Debug, Deserialize, Clone, Copy, PartialEq, Eq)]
+#[serde(rename_all = "snake_case")]
+pub enum SurfaceSignTransform {
+    Invert,
+}
+
 #[derive(Debug, Deserialize, Clone)]
 pub struct SurfacePackFile {
     pub version: String,
@@ -25,9 +35,44 @@ pub struct SurfaceDefinition {
     pub rollup_policy: String,
     pub allowed_source_concepts: Vec<String>,
     pub allowed_authoritative_concepts: Vec<String>,
-    pub formula_fallback: Option<serde_json::Value>,
+    pub formula_fallback: Option<SurfaceFormulaFallback>,
     pub detail_grouping_policy: String,
     pub materiality_policy: String,
+    #[serde(default = "default_include_in_output")]
+    pub include_in_output: bool,
+    #[serde(default)]
+    pub sign_transform: Option<SurfaceSignTransform>,
+}
+
+#[derive(Debug, Deserialize, Clone)]
+#[serde(untagged)]
+pub enum SurfaceFormulaFallback {
+    LegacyString(#[allow(dead_code)] String),
+    Structured(SurfaceFormula),
+}
+
+impl SurfaceFormulaFallback {
+    pub fn structured(&self) -> Option<&SurfaceFormula> {
+        match self {
+            Self::Structured(formula) => Some(formula),
+            Self::LegacyString(_) => None,
+        }
+    }
+}
+
+#[derive(Debug, Deserialize, Clone)]
+pub struct SurfaceFormula {
+    pub op: SurfaceFormulaOp,
+    pub sources: Vec<String>,
+    #[serde(default)]
+    pub treat_null_as_zero: bool,
+}
+
+#[derive(Debug, Deserialize, Clone, Copy, PartialEq, Eq)]
+#[serde(rename_all = "snake_case")]
+pub enum SurfaceFormulaOp {
+    Sum,
+    Subtract,
 }
 
 #[derive(Debug, Deserialize, Clone)]
@@ -147,7 +192,9 @@ pub fn resolve_taxonomy_dir() -> Result<PathBuf> {
     candidates
         .into_iter()
        .find(|path| path.is_dir())
-        .ok_or_else(|| anyhow!("taxonomy resolution failed: unable to locate runtime taxonomy directory"))
+        .ok_or_else(|| {
+            anyhow!("taxonomy resolution failed: unable to locate runtime taxonomy directory")
+        })
 }
 
 pub fn load_surface_pack(pack: FiscalPack) -> Result<SurfacePackFile> {
@@ -156,14 +203,52 @@ pub fn load_surface_pack(pack: FiscalPack) -> Result<SurfacePackFile> {
         .join("fiscal")
         .join("v1")
         .join(format!("{}.surface.json", pack.as_str()));
-    let raw = fs::read_to_string(&path)
-        .with_context(|| format!("taxonomy resolution failed: unable to read {}", path.display()))?;
-    let file = serde_json::from_str::<SurfacePackFile>(&raw)
-        .with_context(|| format!("taxonomy resolution failed: unable to parse {}", path.display()))?;
+    let mut file = load_surface_pack_file(&path)?;
+
+    if !matches!(pack, FiscalPack::Core) {
+        let core_path = taxonomy_dir
+            .join("fiscal")
+            .join("v1")
+            .join("core.surface.json");
+        let core_file = load_surface_pack_file(&core_path)?;
+        let pack_inherited_keys = file
+            .surfaces
+            .iter()
+            .filter(|surface| surface.statement == "balance" || surface.statement == "cash_flow")
+            .map(|surface| (surface.statement.clone(), surface.surface_key.clone()))
+            .collect::<std::collections::HashSet<_>>();
+
+        file.surfaces.extend(
+            core_file
+                .surfaces
+                .into_iter()
+                .filter(|surface| surface.statement == "balance" || surface.statement == "cash_flow")
+                .filter(|surface| {
+                    !pack_inherited_keys
+                        .contains(&(surface.statement.clone(), surface.surface_key.clone()))
+                }),
+        );
+    }
+
     let _ = (&file.version, &file.pack);
     Ok(file)
 }
 
+fn load_surface_pack_file(path: &PathBuf) -> Result<SurfacePackFile> {
+    let raw = fs::read_to_string(path).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to read {}",
+            path.display()
+        )
+    })?;
+    serde_json::from_str::<SurfacePackFile>(&raw).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to parse {}",
+            path.display()
+        )
+    })
+}
+
 pub fn load_crosswalk(regime: &str) -> Result<Option<CrosswalkFile>> {
     let file_name = match regime {
         "us-gaap" => "us-gaap.json",
@@ -173,10 +258,18 @@ pub fn load_crosswalk(regime: &str) -> Result<Option<CrosswalkFile>> {
 
     let taxonomy_dir = resolve_taxonomy_dir()?;
     let path = taxonomy_dir.join("crosswalk").join(file_name);
-    let raw = fs::read_to_string(&path)
-        .with_context(|| format!("taxonomy resolution failed: unable to read {}", path.display()))?;
-    let file = serde_json::from_str::<CrosswalkFile>(&raw)
-        .with_context(|| format!("taxonomy resolution failed: unable to parse {}", path.display()))?;
+    let raw = fs::read_to_string(&path).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to read {}",
+            path.display()
+        )
+    })?;
+    let file = serde_json::from_str::<CrosswalkFile>(&raw).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to parse {}",
+            path.display()
+        )
+    })?;
     let _ = (&file.version, &file.regime);
     Ok(Some(file))
 }
@@ -188,10 +281,18 @@ pub fn load_kpi_pack(pack: FiscalPack) -> Result<KpiPackFile> {
         .join("v1")
         .join("kpis")
         .join(format!("{}.kpis.json", pack.as_str()));
-    let raw = fs::read_to_string(&path)
-        .with_context(|| format!("taxonomy resolution failed: unable to read {}", path.display()))?;
-    let file = serde_json::from_str::<KpiPackFile>(&raw)
-        .with_context(|| format!("taxonomy resolution failed: unable to parse {}", path.display()))?;
+    let raw = fs::read_to_string(&path).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to read {}",
+            path.display()
+        )
+    })?;
+    let file = serde_json::from_str::<KpiPackFile>(&raw).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to parse {}",
+            path.display()
+        )
+    })?;
     let _ = (&file.version, &file.pack);
     Ok(file)
 }
@@ -202,10 +303,18 @@ pub fn load_universal_income_definitions() -> Result<UniversalIncomeFile> {
         .join("fiscal")
         .join("v1")
         .join("universal_income.surface.json");
-    let raw = fs::read_to_string(&path)
-        .with_context(|| format!("taxonomy resolution failed: unable to read {}", path.display()))?;
-    let file = serde_json::from_str::<UniversalIncomeFile>(&raw)
-        .with_context(|| format!("taxonomy resolution failed: unable to parse {}", path.display()))?;
+    let raw = fs::read_to_string(&path).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to read {}",
+            path.display()
+        )
+    })?;
+    let file = serde_json::from_str::<UniversalIncomeFile>(&raw).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to parse {}",
+            path.display()
+        )
+    })?;
     let _ = &file.version;
     Ok(file)
 }
@@ -216,10 +325,18 @@ pub fn load_income_bridge(pack: FiscalPack) -> Result<IncomeBridgeFile> {
         .join("fiscal")
         .join("v1")
         .join(format!("{}.income-bridge.json", pack.as_str()));
-    let raw = fs::read_to_string(&path)
-        .with_context(|| format!("taxonomy resolution failed: unable to read {}", path.display()))?;
-    let file = serde_json::from_str::<IncomeBridgeFile>(&raw)
-        .with_context(|| format!("taxonomy resolution failed: unable to parse {}", path.display()))?;
+    let raw = fs::read_to_string(&path).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to read {}",
+            path.display()
+        )
+    })?;
+    let file = serde_json::from_str::<IncomeBridgeFile>(&raw).with_context(|| {
+        format!(
+            "taxonomy resolution failed: unable to parse {}",
+            path.display()
+        )
+    })?;
     let _ = (&file.version, &file.pack);
     Ok(file)
 }
@@ -230,17 +347,20 @@ mod tests {
 
     #[test]
     fn resolves_taxonomy_dir_and_loads_core_pack() {
-        let taxonomy_dir = resolve_taxonomy_dir().expect("taxonomy dir should resolve during tests");
+        let taxonomy_dir =
+            resolve_taxonomy_dir().expect("taxonomy dir should resolve during tests");
         assert!(taxonomy_dir.exists());
 
-        let surface_pack = load_surface_pack(FiscalPack::Core).expect("core surface pack should load");
+        let surface_pack =
+            load_surface_pack(FiscalPack::Core).expect("core surface pack should load");
         assert_eq!(surface_pack.pack, "core");
         assert!(!surface_pack.surfaces.is_empty());
 
         let kpi_pack = load_kpi_pack(FiscalPack::Core).expect("core kpi pack should load");
         assert_eq!(kpi_pack.pack, "core");
 
-        let universal_income = load_universal_income_definitions().expect("universal income config should load");
+        let universal_income =
+            load_universal_income_definitions().expect("universal income config should load");
         assert!(!universal_income.rows.is_empty());
 
         let core_bridge = load_income_bridge(FiscalPack::Core).expect("core bridge should load");
File diff suppressed because it is too large
@@ -156,7 +156,7 @@
       "surface_key": "loans",
       "statement": "balance",
       "label": "Loans",
-      "category": "surface",
+      "category": "noncurrent_assets",
       "order": 30,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
@@ -181,7 +181,7 @@
       "surface_key": "allowance_for_credit_losses",
       "statement": "balance",
       "label": "Allowance for Credit Losses",
-      "category": "surface",
+      "category": "noncurrent_assets",
       "order": 40,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
@@ -201,7 +201,7 @@
       "surface_key": "deposits",
       "statement": "balance",
       "label": "Deposits",
-      "category": "surface",
+      "category": "current_liabilities",
       "order": 80,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
@@ -215,7 +215,7 @@
       "surface_key": "total_assets",
       "statement": "balance",
       "label": "Total Assets",
-      "category": "surface",
+      "category": "derived",
       "order": 90,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -229,7 +229,7 @@
       "surface_key": "total_liabilities",
       "statement": "balance",
       "label": "Total Liabilities",
-      "category": "surface",
+      "category": "derived",
       "order": 100,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -243,7 +243,7 @@
       "surface_key": "total_equity",
       "statement": "balance",
       "label": "Total Equity",
-      "category": "surface",
+      "category": "equity",
       "order": 110,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -63,7 +63,7 @@
       "surface_key": "total_assets",
       "statement": "balance",
       "label": "Total Assets",
-      "category": "surface",
+      "category": "derived",
       "order": 90,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -77,7 +77,7 @@
       "surface_key": "total_liabilities",
       "statement": "balance",
       "label": "Total Liabilities",
-      "category": "surface",
+      "category": "derived",
       "order": 100,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -91,7 +91,7 @@
       "surface_key": "total_equity",
       "statement": "balance",
       "label": "Total Equity",
-      "category": "surface",
+      "category": "equity",
       "order": 110,
       "unit": "currency",
       "rollup_policy": "direct_only",
File diff suppressed because it is too large
@@ -119,7 +119,7 @@
       "surface_key": "policy_liabilities",
       "statement": "balance",
       "label": "Policy Liabilities",
-      "category": "surface",
+      "category": "noncurrent_liabilities",
       "order": 80,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
@@ -145,17 +145,19 @@
       "surface_key": "deferred_acquisition_costs",
       "statement": "balance",
       "label": "Deferred Acquisition Costs",
-      "category": "surface",
+      "category": "noncurrent_assets",
       "order": 90,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
       "allowed_source_concepts": [
         "us-gaap:DeferredPolicyAcquisitionCosts",
-        "us-gaap:DeferredAcquisitionCosts"
+        "us-gaap:DeferredAcquisitionCosts",
+        "us-gaap:DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired"
       ],
       "allowed_authoritative_concepts": [
         "us-gaap:DeferredPolicyAcquisitionCosts",
-        "us-gaap:DeferredAcquisitionCosts"
+        "us-gaap:DeferredAcquisitionCosts",
+        "us-gaap:DeferredPolicyAcquisitionCostsAndValueOfBusinessAcquired"
       ],
       "formula_fallback": null,
       "detail_grouping_policy": "group_all_children",
@@ -165,7 +167,7 @@
       "surface_key": "total_assets",
       "statement": "balance",
       "label": "Total Assets",
-      "category": "surface",
+      "category": "derived",
       "order": 100,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -179,7 +181,7 @@
       "surface_key": "total_liabilities",
       "statement": "balance",
       "label": "Total Liabilities",
-      "category": "surface",
+      "category": "derived",
       "order": 110,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -193,7 +195,7 @@
       "surface_key": "total_equity",
       "statement": "balance",
       "label": "Total Equity",
-      "category": "surface",
+      "category": "equity",
       "order": 120,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -78,7 +78,7 @@
       "surface_key": "investment_property",
       "statement": "balance",
       "label": "Investment Property",
-      "category": "surface",
+      "category": "noncurrent_assets",
       "order": 40,
       "unit": "currency",
       "rollup_policy": "aggregate_children",
@@ -99,7 +99,7 @@
       "surface_key": "total_assets",
       "statement": "balance",
       "label": "Total Assets",
-      "category": "surface",
+      "category": "derived",
       "order": 90,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -113,7 +113,7 @@
       "surface_key": "total_liabilities",
       "statement": "balance",
       "label": "Total Liabilities",
-      "category": "surface",
+      "category": "derived",
       "order": 100,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -127,7 +127,7 @@
       "surface_key": "total_equity",
       "statement": "balance",
       "label": "Total Equity",
-      "category": "surface",
+      "category": "equity",
       "order": 110,
       "unit": "currency",
       "rollup_policy": "direct_only",
@@ -136,6 +136,25 @@
       "formula_fallback": null,
       "detail_grouping_policy": "top_level_only",
       "materiality_policy": "balance_default"
+    },
+    {
+      "surface_key": "capital_expenditures",
+      "statement": "cash_flow",
+      "label": "Capital Expenditures",
+      "category": "investing",
+      "order": 130,
+      "unit": "currency",
+      "rollup_policy": "aggregate_children",
+      "allowed_source_concepts": [
+        "us-gaap:PaymentsToAcquireCommercialRealEstate",
+        "us-gaap:PaymentsForCapitalImprovements",
+        "us-gaap:PaymentsForDepositsOnRealEstateAcquisitions"
+      ],
+      "allowed_authoritative_concepts": [],
+      "formula_fallback": null,
+      "detail_grouping_policy": "group_all_children",
+      "materiality_policy": "cash_flow_default",
+      "sign_transform": "invert"
     }
   ]
 }
@@ -5,7 +5,7 @@ import { hydrateFilingTaxonomySnapshot } from '@/lib/server/taxonomy/engine';
 import type { TaxonomyHydrationInput, TaxonomyHydrationResult } from '@/lib/server/taxonomy/types';
 
 type ComparisonTarget = {
-  statement: Extract<FinancialStatementKind, 'income' | 'balance'>;
+  statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
   surfaceKey: string;
   fiscalAiLabels: string[];
   allowNotMeaningful?: boolean;
@@ -46,7 +46,7 @@ type FiscalAiTable = {
|
|||||||
};
|
};
|
||||||
|
|
||||||
type ComparisonRow = {
|
type ComparisonRow = {
|
||||||
statement: Extract<FinancialStatementKind, 'income' | 'balance'>;
|
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
|
||||||
surfaceKey: string;
|
surfaceKey: string;
|
||||||
fiscalAiLabel: string | null;
|
fiscalAiLabel: string | null;
|
||||||
fiscalAiValueM: number | null;
|
fiscalAiValueM: number | null;
|
||||||
@@ -89,6 +89,11 @@ const CASES: CompanyCase[] = [
|
|||||||
surfaceKey: 'net_income',
|
surfaceKey: 'net_income',
|
||||||
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
||||||
},
|
},
|
||||||
|
{ statement: 'balance', surfaceKey: 'current_assets', fiscalAiLabels: ['Current Assets', 'Total Current Assets'] },
|
||||||
|
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'capital_expenditures', fiscalAiLabels: ['Capital Expenditures', 'Capital Expenditure'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] },
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -113,6 +118,11 @@ const CASES: CompanyCase[] = [
|
|||||||
surfaceKey: 'net_income',
|
surfaceKey: 'net_income',
|
||||||
fiscalAiLabels: ['Net Income to Common', 'Net Income Attributable to Common Shareholders', 'Net Income']
|
fiscalAiLabels: ['Net Income to Common', 'Net Income Attributable to Common Shareholders', 'Net Income']
|
||||||
},
|
},
|
||||||
|
{ statement: 'balance', surfaceKey: 'loans', fiscalAiLabels: ['Net Loans', 'Loans', 'Loans Receivable'] },
|
||||||
|
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Net Cash from Operating Activities', 'Net Cash Provided by Operating'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'investing_cash_flow', fiscalAiLabels: ['Cash from Investing Activities', 'Net Cash from Investing Activities', 'Net Cash Provided by Investing'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'financing_cash_flow', fiscalAiLabels: ['Cash from Financing Activities', 'Net Cash from Financing Activities', 'Net Cash Provided by Financing'] },
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -137,6 +147,18 @@ const CASES: CompanyCase[] = [
|
|||||||
surfaceKey: 'net_income',
|
surfaceKey: 'net_income',
|
||||||
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
statement: 'balance',
|
||||||
|
surfaceKey: 'deferred_acquisition_costs',
|
||||||
|
fiscalAiLabels: [
|
||||||
|
'Deferred Acquisition Costs',
|
||||||
|
'Deferred Policy Acquisition Costs',
|
||||||
|
'Deferred Policy Acquisition Costs and Value of Business Acquired'
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] },
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -154,7 +176,22 @@ const CASES: CompanyCase[] = [
|
|||||||
statement: 'income',
|
statement: 'income',
|
||||||
surfaceKey: 'net_income',
|
surfaceKey: 'net_income',
|
||||||
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
fiscalAiLabels: ['Net Income Attributable to Common Shareholders', 'Consolidated Net Income', 'Net Income']
|
||||||
}
|
},
|
||||||
|
{
|
||||||
|
statement: 'balance',
|
||||||
|
surfaceKey: 'investment_property',
|
||||||
|
fiscalAiLabels: [
|
||||||
|
'Investment Property',
|
||||||
|
'Investment Properties',
|
||||||
|
'Real Estate Investment Property, Net',
|
||||||
|
'Real Estate Investment Property, at Cost',
|
||||||
|
'Total real estate held for investment, at cost'
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{ statement: 'balance', surfaceKey: 'total_assets', fiscalAiLabels: ['Total Assets'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'operating_cash_flow', fiscalAiLabels: ['Cash from Operating Activities', 'Operating Cash Flow', 'Net Cash from Operations', 'Net Cash Provided by Operating'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'capital_expenditures', fiscalAiLabels: ['Capital Expenditures', 'Capital Expenditure'] },
|
||||||
|
{ statement: 'cash_flow', surfaceKey: 'free_cash_flow', fiscalAiLabels: ['Free Cash Flow', 'Levered Free Cash Flow'] }
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -184,6 +221,9 @@ const CASES: CompanyCase[] = [
|
|||||||
];
|
];
|
||||||
|
|
||||||
function parseTickerFilter(argv: string[]) {
|
function parseTickerFilter(argv: string[]) {
|
||||||
|
let ticker: string | null = null;
|
||||||
|
let statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'> | null = null;
|
||||||
|
|
||||||
for (const arg of argv) {
|
for (const arg of argv) {
|
||||||
if (arg === '--help' || arg === '-h') {
|
if (arg === '--help' || arg === '-h') {
|
||||||
console.log('Compare live Fiscal.ai standardized statement rows against local sidecar output.');
|
console.log('Compare live Fiscal.ai standardized statement rows against local sidecar output.');
|
||||||
@@ -191,16 +231,26 @@ function parseTickerFilter(argv: string[]) {
|
|||||||
console.log('Usage:');
|
console.log('Usage:');
|
||||||
console.log(' bun run scripts/compare-fiscal-ai-statements.ts');
|
console.log(' bun run scripts/compare-fiscal-ai-statements.ts');
|
||||||
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --ticker=MSFT');
|
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --ticker=MSFT');
|
||||||
|
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --statement=balance');
|
||||||
|
console.log(' bun run scripts/compare-fiscal-ai-statements.ts --statement=cash_flow');
|
||||||
process.exit(0);
|
process.exit(0);
|
||||||
}
|
}
|
||||||
|
|
||||||
if (arg.startsWith('--ticker=')) {
|
if (arg.startsWith('--ticker=')) {
|
||||||
const value = arg.slice('--ticker='.length).trim().toUpperCase();
|
const value = arg.slice('--ticker='.length).trim().toUpperCase();
|
||||||
return value.length > 0 ? value : null;
|
ticker = value.length > 0 ? value : null;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (arg.startsWith('--statement=')) {
|
||||||
|
const value = arg.slice('--statement='.length).trim().toLowerCase().replace(/-/g, '_');
|
||||||
|
if (value === 'income' || value === 'balance' || value === 'cash_flow') {
|
||||||
|
statement = value;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return null;
|
return { ticker, statement };
|
||||||
}
|
}
|
||||||
|
|
||||||
function normalizeLabel(value: string) {
|
function normalizeLabel(value: string) {
|
||||||
@@ -295,10 +345,98 @@ function chooseInstantPeriodId(result: TaxonomyHydrationResult) {
|
|||||||
return instantPeriods[0]?.id ?? null;
|
return instantPeriods[0]?.id ?? null;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function parseColumnLabelPeriodEnd(columnLabel: string) {
|
||||||
|
const match = columnLabel.match(/^([A-Za-z]{3})\s+'?(\d{2,4})$/);
|
||||||
|
if (!match) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
const [, monthToken, yearToken] = match;
|
||||||
|
const monthMap: Record<string, number> = {
|
||||||
|
jan: 0,
|
||||||
|
feb: 1,
|
||||||
|
mar: 2,
|
||||||
|
apr: 3,
|
||||||
|
may: 4,
|
||||||
|
jun: 5,
|
||||||
|
jul: 6,
|
||||||
|
aug: 7,
|
||||||
|
sep: 8,
|
||||||
|
oct: 9,
|
||||||
|
nov: 10,
|
||||||
|
dec: 11
|
||||||
|
};
|
||||||
|
const month = monthMap[monthToken.toLowerCase()];
|
||||||
|
if (month === undefined) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
const parsedYear = Number.parseInt(yearToken, 10);
|
||||||
|
if (!Number.isFinite(parsedYear)) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
const year = yearToken.length === 2 ? 2000 + parsedYear : parsedYear;
|
||||||
|
return { month, year };
|
||||||
|
}
|
||||||
|
|
||||||
|
function choosePeriodIdForColumnLabel(
|
||||||
|
result: TaxonomyHydrationResult,
|
||||||
|
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>,
|
||||||
|
columnLabel: string
|
||||||
|
) {
|
||||||
|
const parsed = parseColumnLabelPeriodEnd(columnLabel);
|
||||||
|
if (!parsed) {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
const matchingPeriods = result.periods
|
||||||
|
.filter((period): period is ResultPeriod => {
|
||||||
|
const end = periodEnd(period as ResultPeriod);
|
||||||
|
if (!end) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
const endDate = new Date(end);
|
||||||
|
if (Number.isNaN(endDate.getTime())) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
const periodMatchesStatement = statement === 'balance'
|
||||||
|
? !periodStart(period as ResultPeriod)
|
||||||
|
: Boolean(periodStart(period as ResultPeriod));
|
||||||
|
if (!periodMatchesStatement) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
return endDate.getUTCFullYear() === parsed.year && endDate.getUTCMonth() === parsed.month;
|
||||||
|
})
|
||||||
|
.sort((left, right) => {
|
||||||
|
if (statement !== 'balance') {
|
||||||
|
const leftStart = periodStart(left);
|
||||||
|
const rightStart = periodStart(right);
|
||||||
|
const leftDuration = leftStart
|
||||||
|
? Math.round((Date.parse(periodEnd(left) as string) - Date.parse(leftStart)) / (1000 * 60 * 60 * 24))
|
||||||
|
: -1;
|
||||||
|
const rightDuration = rightStart
|
||||||
|
? Math.round((Date.parse(periodEnd(right) as string) - Date.parse(rightStart)) / (1000 * 60 * 60 * 24))
|
||||||
|
: -1;
|
||||||
|
|
||||||
|
if (leftDuration !== rightDuration) {
|
||||||
|
return rightDuration - leftDuration;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return Date.parse(periodEnd(right) as string) - Date.parse(periodEnd(left) as string);
|
||||||
|
});
|
||||||
|
|
||||||
|
return matchingPeriods[0]?.id ?? null;
|
||||||
|
}
|
||||||
|
|
||||||
function findSurfaceValue(
|
function findSurfaceValue(
|
||||||
result: TaxonomyHydrationResult,
|
result: TaxonomyHydrationResult,
|
||||||
statement: Extract<FinancialStatementKind, 'income' | 'balance'>,
|
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>,
|
||||||
surfaceKey: string
|
surfaceKey: string,
|
||||||
|
referenceColumnLabel?: string
|
||||||
) {
|
) {
|
||||||
const rows = result.surface_rows[statement] ?? [];
|
const rows = result.surface_rows[statement] ?? [];
|
||||||
const row = rows.find((entry) => entry.key === surfaceKey) ?? null;
|
const row = rows.find((entry) => entry.key === surfaceKey) ?? null;
|
||||||
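The new period matching hinges on `parseColumnLabelPeriodEnd`, which turns Fiscal.ai column headers like `Dec '24` into a month/year pair. A self-contained sketch of the same parsing logic (with an array lookup swapped in for the diff's `monthMap` object):

```typescript
// Standalone sketch of the "Mon 'YY" / "Mon YYYY" header parsing introduced above.
function parseColumnLabelPeriodEnd(columnLabel: string): { month: number; year: number } | null {
  // Three-letter month, whitespace, optional apostrophe, then a 2- or 4-digit year.
  const match = columnLabel.match(/^([A-Za-z]{3})\s+'?(\d{2,4})$/);
  if (!match) {
    return null;
  }
  const [, monthToken, yearToken] = match;
  const months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];
  const month = months.indexOf(monthToken.toLowerCase());
  if (month < 0) {
    return null;
  }
  const parsedYear = Number.parseInt(yearToken, 10);
  if (!Number.isFinite(parsedYear)) {
    return null;
  }
  // Two-digit years are assumed to be 20xx, matching the diff above.
  return { month, year: yearToken.length === 2 ? 2000 + parsedYear : parsedYear };
}

// parseColumnLabelPeriodEnd("Dec '24") → { month: 11, year: 2024 }
// parseColumnLabelPeriodEnd("FY 2024") → null (label shape not recognized)
```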
```diff
@@ -306,9 +444,11 @@ function findSurfaceValue(
     return { row: null, value: null };
   }
 
-  const periodId = statement === 'balance'
+  const periodId = (referenceColumnLabel
+    ? choosePeriodIdForColumnLabel(result, statement, referenceColumnLabel)
+    : null) ?? (statement === 'balance'
     ? chooseInstantPeriodId(result)
-    : chooseDurationPeriodId(result);
+    : chooseDurationPeriodId(result));
 
   if (periodId) {
     const directValue = row.values[periodId];
@@ -412,14 +552,24 @@ async function fetchLatestAnnualFiling(company: CompanyCase): Promise<TaxonomyHy
 async function scrapeFiscalAiTable(
   page: import('@playwright/test').Page,
   exchangeTicker: string,
-  statement: 'income' | 'balance'
+  statement: 'income' | 'balance' | 'cash_flow'
 ): Promise<FiscalAiTable> {
-  const pagePath = statement === 'income' ? 'income-statement' : 'balance-sheet';
+  const pagePath = statement === 'income'
+    ? 'income-statement'
+    : statement === 'balance'
+      ? 'balance-sheet'
+      : 'cash-flow-statement';
   const url = `https://fiscal.ai/company/${exchangeTicker}/financials/${pagePath}/annual/?templateType=standardized`;
 
   await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 120_000 });
   await page.waitForSelector('table', { timeout: 120_000 });
   await page.waitForTimeout(2_500);
+  await page.evaluate(async () => {
+    window.scrollTo(0, document.body.scrollHeight);
+    await new Promise((resolve) => setTimeout(resolve, 750));
+    window.scrollTo(0, 0);
+    await new Promise((resolve) => setTimeout(resolve, 250));
+  });
+
   return await page.evaluate(() => {
     function normalizeLabel(value: string) {
@@ -452,45 +602,52 @@ async function scrapeFiscalAiTable(
       return Number.isFinite(parsed) ? (negative ? -Math.abs(parsed) : parsed) : null;
     }
 
-    const table = document.querySelector('table');
-    if (!table) {
+    const tables = Array.from(document.querySelectorAll('table'));
+    if (tables.length === 0) {
       throw new Error('Fiscal.ai table not found');
     }
 
+    const rowsByLabel = new Map<string, FiscalAiTableRow>();
+    let columnLabel = 'unknown';
+
+    for (const table of tables) {
       const headerCells = Array.from(table.querySelectorAll('tr:first-child th, tr:first-child td'))
         .map((cell) => cell.textContent?.trim() ?? '')
         .filter((value) => value.length > 0);
 
       const annualColumnIndex = headerCells.findIndex((value, index) => index > 0 && value !== 'LTM');
       if (annualColumnIndex < 0) {
-        throw new Error(`Could not locate latest annual column in headers: ${headerCells.join(' | ')}`);
+        continue;
       }
 
-      const rows = Array.from(table.querySelectorAll('tr'))
-        .slice(1)
-        .map((row) => {
+      if (columnLabel === 'unknown') {
+        columnLabel = headerCells[annualColumnIndex] ?? 'unknown';
+      }
+
+      for (const row of Array.from(table.querySelectorAll('tr')).slice(1)) {
         const cells = Array.from(row.querySelectorAll('td'));
         if (cells.length <= annualColumnIndex) {
-          return null;
+          continue;
         }
 
         const label = cells[0]?.textContent?.trim() ?? '';
         const valueText = cells[annualColumnIndex]?.textContent?.trim() ?? '';
         if (!label) {
-          return null;
+          continue;
        }
 
-        return {
+        rowsByLabel.set(label, {
          label,
          normalizedLabel: normalizeLabel(label),
          valueText,
          value: parseDisplayedNumber(valueText)
-        };
-      })
-      .filter((entry): entry is FiscalAiTableRow => entry !== null);
+        });
+      }
+    }
 
+    const rows = Array.from(rowsByLabel.values());
+
     return {
-      columnLabel: headerCells[annualColumnIndex] ?? 'unknown',
+      columnLabel,
       rows
     };
   });
```
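The rewritten scraper merges rows from every `<table>` on the page into a label-keyed `Map`, so a label that appears in more than one table keeps its last-seen value. A reduced sketch of that de-duplication (simplified row type, not the script's `FiscalAiTableRow`):

```typescript
// Sketch of the Map-based row de-duplication introduced above, for pages
// where the statement is rendered across several <table> elements.
type ScrapedRow = { label: string; value: number | null };

function dedupeByLabel(tables: ScrapedRow[][]): ScrapedRow[] {
  const rowsByLabel = new Map<string, ScrapedRow>();
  for (const table of tables) {
    for (const row of table) {
      // Map.set overwrites, so later tables win for a repeated label.
      rowsByLabel.set(row.label, row);
    }
  }
  // Map preserves insertion order, so first-seen labels keep their position.
  return Array.from(rowsByLabel.values());
}

// dedupeByLabel([[{ label: 'Total Assets', value: 1 }], [{ label: 'Total Assets', value: 2 }]])
//   → [{ label: 'Total Assets', value: 2 }]
```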
```diff
@@ -536,7 +693,7 @@ function compareRow(
 ): ComparisonRow {
   const fiscalAiRow = findFiscalAiRow(fiscalAiTable.rows, target.fiscalAiLabels);
   const fiscalAiValueM = fiscalAiRow?.value ?? null;
-  const ourSurface = findSurfaceValue(result, target.statement, target.surfaceKey);
+  const ourSurface = findSurfaceValue(result, target.statement, target.surfaceKey, fiscalAiTable.columnLabel);
   const ourValueM = roundMillions(ourSurface.value);
   const absDiffM = absoluteDiff(ourValueM, fiscalAiValueM);
   const relDiffValue = relativeDiff(ourValueM, fiscalAiValueM);
@@ -587,17 +744,34 @@ async function compareCase(page: import('@playwright/test').Page, company: Compa
     throw new Error(`${company.ticker} parse_status=${result.parse_status}${result.parse_error ? ` parse_error=${result.parse_error}` : ''}`);
   }
 
-  const incomeTable = await scrapeFiscalAiTable(page, company.exchangeTicker, 'income');
-  const balanceTable = await scrapeFiscalAiTable(page, company.exchangeTicker, 'balance');
+  const statementKinds = new Set(company.comparisons.map((target) => target.statement));
+  const incomeTable = statementKinds.has('income')
+    ? await scrapeFiscalAiTable(page, company.exchangeTicker, 'income')
+    : null;
+  const balanceTable = statementKinds.has('balance')
+    ? await scrapeFiscalAiTable(page, company.exchangeTicker, 'balance')
+    : null;
+  const cashFlowTable = statementKinds.has('cash_flow')
+    ? await scrapeFiscalAiTable(page, company.exchangeTicker, 'cash_flow')
+    : null;
   const rows = company.comparisons.map((target) => {
-    const table = target.statement === 'income' ? incomeTable : balanceTable;
+    const table = target.statement === 'income'
+      ? incomeTable
+      : target.statement === 'balance'
+        ? balanceTable
+        : cashFlowTable;
+    if (!table) {
+      throw new Error(`Missing scraped table for ${target.statement}`);
+    }
     return compareRow(target, result, table);
   });
 
-  const failures = rows.filter((row) => row.status === 'fail' || row.status === 'missing_ours');
+  const failures = rows.filter(
+    (row) => row.status === 'fail' || row.status === 'missing_ours' || row.status === 'missing_reference'
+  );
 
   console.log(
-    `[compare-fiscal-ai] ${company.ticker} filing=${filing.accessionNumber} fiscal_pack=${result.fiscal_pack ?? 'null'} income_column="${incomeTable.columnLabel}" balance_column="${balanceTable.columnLabel}" pass=${rows.length - failures.length}/${rows.length}`
+    `[compare-fiscal-ai] ${company.ticker} filing=${filing.accessionNumber} fiscal_pack=${result.fiscal_pack ?? 'null'} income_column="${incomeTable?.columnLabel ?? 'n/a'}" balance_column="${balanceTable?.columnLabel ?? 'n/a'}" cash_flow_column="${cashFlowTable?.columnLabel ?? 'n/a'}" pass=${rows.length - failures.length}/${rows.length}`
   );
   for (const row of rows) {
     console.log(
@@ -625,18 +799,28 @@ async function compareCase(page: import('@playwright/test').Page, company: Compa
 
 async function main() {
   process.env.XBRL_ENGINE_TIMEOUT_MS = process.env.XBRL_ENGINE_TIMEOUT_MS ?? '180000';
-  const tickerFilter = parseTickerFilter(process.argv.slice(2));
-  const selectedCases = tickerFilter
-    ? CASES.filter((entry) => entry.ticker === tickerFilter)
-    : CASES;
+  const filters = parseTickerFilter(process.argv.slice(2));
+  const selectedCases = (filters.ticker
+    ? CASES.filter((entry) => entry.ticker === filters.ticker)
+    : CASES
+  )
+    .map((entry) => ({
+      ...entry,
+      comparisons: filters.statement
+        ? entry.comparisons.filter((target) => target.statement === filters.statement)
+        : entry.comparisons
+    }))
+    .filter((entry) => entry.comparisons.length > 0);
 
   if (selectedCases.length === 0) {
-    console.error(`[compare-fiscal-ai] unknown ticker: ${tickerFilter}`);
+    console.error(
+      `[compare-fiscal-ai] no matching cases for ticker=${filters.ticker ?? 'all'} statement=${filters.statement ?? 'all'}`
+    );
     process.exitCode = 1;
     return;
   }
 
-  const browser = await chromium.launch({ headless: false });
+  const browser = await chromium.launch({ headless: true });
   const page = await browser.newPage({
     userAgent: BROWSER_USER_AGENT
   });
```
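The `--statement` flag added to `parseTickerFilter` accepts hyphenated spellings by rewriting them to underscores before validating against the three known statement kinds. A standalone sketch of that normalization:

```typescript
// Sketch of the --statement flag normalization from parseTickerFilter above.
function normalizeStatementFlag(raw: string): 'income' | 'balance' | 'cash_flow' | null {
  // Lowercase and rewrite hyphens so "cash-flow" and "cash_flow" both match.
  const value = raw.trim().toLowerCase().replace(/-/g, '_');
  return value === 'income' || value === 'balance' || value === 'cash_flow' ? value : null;
}

// normalizeStatementFlag('cash-flow') → 'cash_flow'
// normalizeStatementFlag('Balance')   → 'balance'
// normalizeStatementFlag('cashflow')  → null (unrecognized, filter stays unset)
```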