Remove legacy TypeScript financial surface mapping, make Rust JSON single source of truth

- Delete standard-template.ts, surface.ts, materialize.ts (dead code)
- Delete financial-taxonomy.test.ts (relied on removed code)
- Add missing income statement surfaces to core.surface.json
- Add cost_of_revenue mapping to core.income-bridge.json
- Refactor standardize.ts to remove template dependency
- Simplify financial-taxonomy.ts to use only DB snapshots
- Add architecture documentation
2026-03-15 14:38:48 -04:00
parent 7a42d73a48
commit a7f7be50b4
9 changed files with 574 additions and 5009 deletions


@@ -0,0 +1,145 @@
# Financial Surface Definitions Architecture
## Overview
As of Issue #26, the financial statement mapping architecture follows a **Rust-first approach** where the Rust sidecar is the authoritative source for surface definitions.
**All legacy TypeScript template code has been removed.**
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        SEC EDGAR Filing                         │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                   Rust Sidecar (fiscal-xbrl)                    │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │  rust/taxonomy/fiscal/v1/core.surface.json              │   │
│   │  rust/taxonomy/fiscal/v1/core.income-bridge.json        │   │
│   └─────────────────────────────────────────────────────────┘   │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │  surface_mapper.rs - builds surface_rows                │   │
│   │  kpi_mapper.rs - builds kpi_rows                        │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                         SQLite Database                         │
│   filing_taxonomy_snapshot.surface_rows                         │
│   filing_taxonomy_snapshot.detail_rows                          │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                        TypeScript Layer                         │
│   financial-taxonomy.ts:aggregateSurfaceRows()                  │
│   - Reads surface_rows from DB snapshots                        │
│   - Aggregates across selected periods                          │
│   - Returns to frontend for display                             │
└─────────────────────────────────────────────────────────────────┘
```
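
The TypeScript layer's role in the diagram (read persisted `surface_rows`, merge them across the selected periods, return rows in display order) can be sketched as follows. This is a minimal sketch, not the body of `aggregateSurfaceRows()`: the `SurfaceRowSketch` and `SnapshotSketch` shapes and the "first non-null value wins" merge rule are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real row types live in '@/lib/types'.
type SurfaceRowSketch = {
  key: string;
  label: string;
  order: number;
  values: Record<string, number | null>;
};

type SnapshotSketch = { surface_rows: SurfaceRowSketch[] };

// Merge surface rows from several snapshots, keeping only the selected periods.
function aggregateSurfaceRowsSketch(
  snapshots: SnapshotSketch[],
  selectedPeriodIds: string[]
): SurfaceRowSketch[] {
  const byKey = new Map<string, SurfaceRowSketch>();

  for (const snapshot of snapshots) {
    for (const row of snapshot.surface_rows) {
      let target = byKey.get(row.key);
      if (!target) {
        // Start every selected period at null so gaps stay visible.
        const values: Record<string, number | null> = {};
        for (const periodId of selectedPeriodIds) {
          values[periodId] = null;
        }
        target = { key: row.key, label: row.label, order: row.order, values };
        byKey.set(row.key, target);
      }
      for (const periodId of selectedPeriodIds) {
        const value = row.values[periodId] ?? null;
        // Assumed merge rule: the first non-null value for a period wins.
        if (value !== null && target.values[periodId] === null) {
          target.values[periodId] = value;
        }
      }
    }
  }

  // Display order: by `order`, then by label as a tie-breaker.
  return Array.from(byKey.values()).sort(
    (left, right) => left.order - right.order || left.label.localeCompare(right.label)
  );
}
```

Because the rows arrive already mapped by the Rust sidecar, a function of this kind needs no concept matching or template lookup; it only reshapes persisted data.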
## Source of Truth
### Authoritative Sources (Edit These)
1. **`rust/taxonomy/fiscal/v1/core.surface.json`**
- Defines all surface keys, labels, categories, orders, and formulas
- Example: `revenue`, `cost_of_revenue`, `gross_profit`, `net_income`
2. **`rust/taxonomy/fiscal/v1/core.income-bridge.json`**
- Maps XBRL concepts to income statement surfaces
- Defines component surfaces for formula derivation
### Removed Files (Do NOT Recreate)
The following code has been **permanently removed**:
1. ~~`lib/server/financials/standard-template.ts`~~ - Template definitions (now in Rust JSON)
2. ~~`lib/server/financials/surface.ts`~~ - Fallback surface builder (no longer needed)
3. ~~`buildStandardizedRows` in `lib/server/financials/standardize.ts`~~ - Template-based row builder (replaced by Rust); the file itself remains in simplified form
### Remaining TypeScript Helpers
`lib/server/financials/standardize.ts` (simplified version) contains only:
- `buildLtmStandardizedRows` - Computes LTM values from quarterly data
- `buildDimensionBreakdown` - Builds dimension breakdowns from facts
These operate on already-mapped surface data from the Rust sidecar.
`buildStandardizedRows`, the former template-based fallback in this file, has been removed along with its matching and formula helpers; it is no longer available as a fallback.
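
As an illustration of the LTM helper described above, a trailing-four-quarter sum might look like this. It is a sketch under assumptions, not the code of `buildLtmStandardizedRows`: the function name `ltmFromQuarters`, the chronological input order, and the rule that any missing quarter yields `null` are all hypothetical. It applies only to flow items; point-in-time balance sheet values would not be summed.

```typescript
// Hypothetical helper: sums the four most recent quarterly values.
// `quarterly` is assumed to be in chronological order (oldest first).
function ltmFromQuarters(quarterly: Array<number | null>): number | null {
  if (quarterly.length < 4) {
    return null; // Not enough history for a trailing twelve months.
  }

  let total = 0;
  for (const value of quarterly.slice(-4)) {
    if (value === null) {
      return null; // Assumed rule: a gap in any quarter makes LTM unavailable.
    }
    total += value;
  }
  return total;
}
```

Returning `null` rather than a partial sum avoids showing an LTM figure that silently covers fewer than four quarters.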
## How to Add a New Surface
1. **Add to `rust/taxonomy/fiscal/v1/core.surface.json`**:
```json
{
"surface_key": "new_metric",
"statement": "income",
"label": "New Metric",
"category": "surface",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": ["us-gaap:NewMetricConcept"],
"allowed_authoritative_concepts": ["us-gaap:NewMetricConcept"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
}
```
2. **Add concept mapping to `core.income-bridge.json`** (if needed):
```json
"new_metric": {
"direct_authoritative_concepts": ["us-gaap:NewMetricConcept"],
"direct_source_concepts": ["NewMetricConcept"],
"component_surfaces": { "positive": [], "negative": [] },
"component_concept_groups": { "positive": [], "negative": [] },
"formula": "direct",
"not_meaningful_for_pack": false,
"warning_codes_when_used": []
}
```
3. **Rebuild the Rust sidecar**:
```bash
cd rust && cargo build --release
```
4. **Re-ingest filings** to populate the new surface
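
Before rebuilding, it can help to check that a new entry carries every field shown in the example from step 1. The sketch below does that in TypeScript. The field list is copied from that example only; the authoritative schema is whatever the Rust taxonomy loader accepts, which is not shown here, and `missingSurfaceFields` is a hypothetical helper name.

```typescript
// Field names taken from the example surface entry in step 1 (an assumption,
// not a schema published by the Rust loader).
const REQUIRED_SURFACE_FIELDS = [
  'surface_key',
  'statement',
  'label',
  'category',
  'order',
  'unit',
  'rollup_policy',
  'allowed_source_concepts',
  'allowed_authoritative_concepts',
  'formula_fallback',
  'detail_grouping_policy',
  'materiality_policy'
] as const;

// Hypothetical helper: lists the required fields that an entry does not define.
// A field set to null (such as formula_fallback) still counts as present.
function missingSurfaceFields(entry: Record<string, unknown>): string[] {
  return REQUIRED_SURFACE_FIELDS.filter((field) => !(field in entry));
}

// Usage: parse a candidate entry and report gaps before running cargo build.
const candidate = JSON.parse('{"surface_key": "new_metric", "label": "New Metric"}');
const missingFromCandidate = missingSurfaceFields(candidate);
if (missingFromCandidate.length > 0) {
  console.log(`new_metric is missing: ${missingFromCandidate.join(', ')}`);
}
```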
## Key Surfaces
### Income Statement
| Key | Order | Description |
|-----|------|-------------|
| `revenue` | 10 | Top-line revenue |
| `cost_of_revenue` | 20 | Cost of revenue/COGS |
| `gross_profit` | 30 | Revenue - Cost of Revenue |
| `gross_margin` | 35 | Gross Profit / Revenue (percent) |
| `operating_expenses` | 40 | Total operating expenses |
| `operating_income` | 60 | Gross Profit - Operating Expenses |
| `operating_margin` | 65 | Operating Income / Revenue (percent) |
| `pretax_income` | 80 | Income before taxes |
| `income_tax_expense` | 85 | Income tax provision |
| `effective_tax_rate` | 87 | Tax Expense / Pretax Income (percent) |
| `ebitda` | 88 | Operating Income + D&A |
| `net_income` | 90 | Bottom-line net income |
| `diluted_eps` | 100 | Diluted earnings per share |
| `basic_eps` | 105 | Basic earnings per share |
| `diluted_shares` | 110 | Weighted avg diluted shares |
| `basic_shares` | 115 | Weighted avg basic shares |
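
The formula rows in the table (`gross_profit`, `gross_margin`, `operating_margin`, `effective_tax_rate`) follow simple arithmetic with one subtlety: a missing input or a zero denominator should produce "no value" rather than a number. The sketch below illustrates that for a single period. It is not the Rust sidecar's implementation; the function names are hypothetical, and returning ratios as fractions (0.25 rather than 25) is an assumption about scaling.

```typescript
type PeriodValue = number | null;

// Returns null when either operand is missing.
function subtractOrNull(left: PeriodValue, right: PeriodValue): PeriodValue {
  return left === null || right === null ? null : left - right;
}

// Returns null for a missing operand or a zero denominator.
function divideOrNull(numerator: PeriodValue, denominator: PeriodValue): PeriodValue {
  if (numerator === null || denominator === null || denominator === 0) {
    return null;
  }
  return numerator / denominator;
}

// Illustrative only: derives the formula rows from the table for one period.
function deriveIncomeRatios(input: {
  revenue: PeriodValue;
  costOfRevenue: PeriodValue;
  operatingIncome: PeriodValue;
  pretaxIncome: PeriodValue;
  incomeTaxExpense: PeriodValue;
}) {
  const grossProfit = subtractOrNull(input.revenue, input.costOfRevenue);
  return {
    gross_profit: grossProfit,
    gross_margin: divideOrNull(grossProfit, input.revenue),
    operating_margin: divideOrNull(input.operatingIncome, input.revenue),
    effective_tax_rate: divideOrNull(input.incomeTaxExpense, input.pretaxIncome)
  };
}
```

Propagating `null` instead of treating missing inputs as zero keeps a derived margin from appearing for a period in which one of its components was never reported.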
### Balance Sheet
See `rust/taxonomy/fiscal/v1/core.surface.json` for the complete list.
### Cash Flow Statement
See `rust/taxonomy/fiscal/v1/core.surface.json` for the complete list.
## Related Files
- `rust/fiscal-xbrl-core/src/surface_mapper.rs` - Surface resolution logic
- `rust/fiscal-xbrl-core/src/taxonomy_loader.rs` - JSON loading
- `lib/server/repos/filing-taxonomy.ts` - DB operations
- `lib/server/financial-taxonomy.ts` - Main entry point

File diff suppressed because it is too large


@@ -36,8 +36,7 @@ import {
} from '@/lib/server/financials/bundles';
import {
buildDimensionBreakdown,
buildLtmStandardizedRows,
buildStandardizedRows
buildLtmStandardizedRows
} from '@/lib/server/financials/standardize';
import { buildRatioRows } from '@/lib/server/financials/ratios';
import { buildFinancialCategories, buildTrendSeries } from '@/lib/server/financials/trend-series';
@@ -620,16 +619,7 @@ function buildQuarterlyStatementSurfaceRows(input: {
selectedPeriodIds: input.selectedPeriodIds
});
if (aggregatedRows.length > 0) {
return aggregatedRows;
}
return buildStandardizedRows({
rows: input.faithfulRows,
statement: input.statement,
periods: input.sourcePeriods,
facts: input.facts
}) as SurfaceFinancialRow[];
return aggregatedRows;
}
function aggregatePersistedKpiRows(input: {
@@ -1303,7 +1293,6 @@ export async function getCompanyFinancialTaxonomy(input: GetCompanyFinancialsInp
export const __financialTaxonomyInternals = {
buildRows,
buildStandardizedRows,
buildDimensionBreakdown,
buildNormalizationMetadata,
aggregateSurfaceRows,

File diff suppressed because it is too large


@@ -1,31 +1,11 @@
import type {
DerivedFinancialRow,
DimensionBreakdownRow,
FinancialStatementKind,
FinancialStatementPeriod,
FinancialUnit,
StandardizedFinancialRow,
TaxonomyFactRow,
TaxonomyStatementRow
} from '@/lib/types';
import {
STANDARD_FINANCIAL_TEMPLATES,
type StandardTemplateRowDefinition,
type TemplateFormula
} from '@/lib/server/financials/standard-template';
function normalizeToken(value: string) {
return value.trim().toLowerCase();
}
function tokenizeLabel(value: string) {
return value
.toLowerCase()
.replace(/[^a-z0-9]+/g, ' ')
.trim()
.split(/\s+/)
.filter((token) => token.length > 0);
}
function valueOrNull(values: Record<string, number | null>, periodId: string) {
return periodId in values ? values[periodId] : null;
@@ -39,326 +19,6 @@ function sumValues(values: Array<number | null>, treatNullAsZero = false) {
return values.reduce<number>((sum, value) => sum + (value ?? 0), 0);
}
function subtractValues(left: number | null, right: number | null) {
if (left === null || right === null) {
return null;
}
return left - right;
}
function divideValues(left: number | null, right: number | null) {
if (left === null || right === null || right === 0) {
return null;
}
return left / right;
}
type CandidateMatchKind = 'exact_local_name' | 'secondary_local_name' | 'label_phrase';
type StatementRowCandidate = {
row: TaxonomyStatementRow;
matchKind: CandidateMatchKind;
aliasRank: number;
unit: FinancialUnit;
labelTokenCount: number;
matchedPhraseTokenCount: number;
};
type FactCandidate = {
fact: TaxonomyFactRow;
matchKind: Exclude<CandidateMatchKind, 'label_phrase'>;
aliasRank: number;
unit: FinancialUnit;
};
type ResolvedCandidate =
| {
sourceType: 'row';
matchKind: CandidateMatchKind;
aliasRank: number;
unit: FinancialUnit;
labelTokenCount: number;
matchedPhraseTokenCount: number;
row: TaxonomyStatementRow;
}
| {
sourceType: 'fact';
matchKind: Exclude<CandidateMatchKind, 'label_phrase'>;
aliasRank: number;
unit: FinancialUnit;
fact: TaxonomyFactRow;
};
type DerivedRole = 'expense' | 'addback';
type InternalRowMetadata = {
derivedRoleByPeriod: Record<string, DerivedRole | null>;
};
function resolvedCandidatesForPeriod(input: {
definition: StandardTemplateRowDefinition;
candidates: StatementRowCandidate[];
factCandidates: FactCandidate[];
period: FinancialStatementPeriod;
}) {
const rowCandidates = input.candidates
.filter((candidate) => input.period.id in candidate.row.values && candidate.row.values[input.period.id] !== null)
.map((candidate) => ({
sourceType: 'row' as const,
...candidate
}));
const factCandidates = input.factCandidates
.filter((candidate) => factMatchesPeriod(candidate.fact, input.period))
.map((candidate) => ({
sourceType: 'fact' as const,
...candidate
}));
if (input.definition.selectionPolicy === 'aggregate_multiple_components') {
const aggregateCandidates = [...rowCandidates, ...factCandidates]
.sort((left, right) => compareResolvedCandidates(left, right, input.definition));
const dedupedCandidates: ResolvedCandidate[] = [];
const seenConcepts = new Set<string>();
for (const candidate of aggregateCandidates) {
const conceptKey = candidate.sourceType === 'row'
? candidate.row.key
: candidate.fact.conceptKey;
if (seenConcepts.has(conceptKey)) {
continue;
}
seenConcepts.add(conceptKey);
dedupedCandidates.push(candidate);
}
return dedupedCandidates;
}
const resolvedCandidate = [...rowCandidates, ...factCandidates]
.sort((left, right) => compareResolvedCandidates(left, right, input.definition))[0];
return resolvedCandidate ? [resolvedCandidate] : [];
}
const GLOBAL_EXCLUDE_LABEL_PHRASES = [
'pro forma',
'reconciliation',
'acquiree',
'business combination',
'assets acquired',
'liabilities assumed'
] as const;
function inferUnit(rawUnit: string | null, fallback: FinancialUnit) {
const normalized = (rawUnit ?? '').toLowerCase();
if (!normalized) {
return fallback;
}
if (normalized.includes('usd') || normalized.includes('iso4217')) {
return 'currency';
}
if (normalized.includes('shares')) {
return 'shares';
}
if (normalized.includes('pure') || normalized.includes('percent')) {
return fallback === 'percent' ? 'percent' : 'ratio';
}
return fallback;
}
function rowUnit(row: TaxonomyStatementRow, fallback: FinancialUnit) {
return inferUnit(Object.values(row.units)[0] ?? null, fallback);
}
function isUnitCompatible(expected: FinancialUnit, actual: FinancialUnit) {
if (expected === actual) {
return true;
}
if ((expected === 'percent' || expected === 'ratio') && (actual === 'percent' || actual === 'ratio')) {
return true;
}
return false;
}
function phraseTokens(phrase: string) {
return tokenizeLabel(phrase);
}
function labelContainsPhrase(labelTokens: string[], phrase: string) {
const target = phraseTokens(phrase);
if (target.length === 0 || target.length > labelTokens.length) {
return false;
}
for (let index = 0; index <= labelTokens.length - target.length; index += 1) {
let matched = true;
for (let offset = 0; offset < target.length; offset += 1) {
if (labelTokens[index + offset] !== target[offset]) {
matched = false;
break;
}
}
if (matched) {
return true;
}
}
return false;
}
function matchRank(matchKind: CandidateMatchKind) {
switch (matchKind) {
case 'exact_local_name':
return 0;
case 'secondary_local_name':
return 1;
case 'label_phrase':
return 2;
}
}
function aliasRank(localName: string, aliases: readonly string[] | undefined) {
const normalizedLocalName = normalizeToken(localName);
const matchIndex = (aliases ?? []).findIndex((alias) => normalizeToken(alias) === normalizedLocalName);
return matchIndex === -1 ? Number.MAX_SAFE_INTEGER : matchIndex;
}
function applySignTransform(value: number | null, transform: StandardTemplateRowDefinition['signTransform']) {
if (value === null || !transform) {
return value;
}
if (transform === 'invert') {
return value * -1;
}
return Math.abs(value);
}
function classifyStatementRowCandidate(
row: TaxonomyStatementRow,
definition: StandardTemplateRowDefinition
) {
if (definition.selectionPolicy === 'formula_only') {
return null;
}
const rowLocalName = normalizeToken(row.localName);
if ((definition.matchers.excludeLocalNames ?? []).some((localName) => normalizeToken(localName) === rowLocalName)) {
return null;
}
const labelTokens = tokenizeLabel(row.label);
const excludedLabelPhrases = [
...GLOBAL_EXCLUDE_LABEL_PHRASES,
...(definition.matchers.excludeLabelPhrases ?? [])
];
if (excludedLabelPhrases.some((phrase) => labelContainsPhrase(labelTokens, phrase))) {
return null;
}
const unit = rowUnit(row, definition.unit);
if (!isUnitCompatible(definition.unit, unit)) {
return null;
}
if ((definition.matchers.exactLocalNames ?? []).some((localName) => normalizeToken(localName) === rowLocalName)) {
return {
row,
matchKind: 'exact_local_name',
aliasRank: aliasRank(row.localName, definition.matchers.exactLocalNames),
unit,
labelTokenCount: labelTokens.length,
matchedPhraseTokenCount: 0
} satisfies StatementRowCandidate;
}
if ((definition.matchers.secondaryLocalNames ?? []).some((localName) => normalizeToken(localName) === rowLocalName)) {
return {
row,
matchKind: 'secondary_local_name',
aliasRank: aliasRank(row.localName, definition.matchers.secondaryLocalNames),
unit,
labelTokenCount: labelTokens.length,
matchedPhraseTokenCount: 0
} satisfies StatementRowCandidate;
}
const matchedPhrase = (definition.matchers.allowedLabelPhrases ?? [])
.map((phrase) => ({
phrase,
tokenCount: phraseTokens(phrase).length
}))
.filter(({ phrase }) => labelContainsPhrase(labelTokens, phrase))
.sort((left, right) => right.tokenCount - left.tokenCount)[0];
if (!matchedPhrase) {
return null;
}
if (row.hasDimensions) {
return null;
}
return {
row,
matchKind: 'label_phrase',
aliasRank: Number.MAX_SAFE_INTEGER,
unit,
labelTokenCount: labelTokens.length,
matchedPhraseTokenCount: matchedPhrase.tokenCount
} satisfies StatementRowCandidate;
}
function classifyFactCandidate(
fact: TaxonomyFactRow,
definition: StandardTemplateRowDefinition
) {
if (!fact.isDimensionless) {
return null;
}
const localName = normalizeToken(fact.localName);
if ((definition.matchers.excludeLocalNames ?? []).some((entry) => normalizeToken(entry) === localName)) {
return null;
}
const unit = inferUnit(fact.unit ?? null, definition.unit);
if (!isUnitCompatible(definition.unit, unit)) {
return null;
}
if ((definition.matchers.exactLocalNames ?? []).some((entry) => normalizeToken(entry) === localName)) {
return {
fact,
matchKind: 'exact_local_name',
aliasRank: aliasRank(fact.localName, definition.matchers.exactLocalNames),
unit
} satisfies FactCandidate;
}
if ((definition.matchers.secondaryLocalNames ?? []).some((entry) => normalizeToken(entry) === localName)) {
return {
fact,
matchKind: 'secondary_local_name',
aliasRank: aliasRank(fact.localName, definition.matchers.secondaryLocalNames),
unit
} satisfies FactCandidate;
}
return null;
}
export function factMatchesPeriod(fact: TaxonomyFactRow, period: FinancialStatementPeriod) {
if (period.periodStart) {
return fact.periodStart === period.periodStart && fact.periodEnd === period.periodEnd;
@@ -367,390 +27,6 @@ export function factMatchesPeriod(fact: TaxonomyFactRow, period: FinancialStatem
return (fact.periodInstant ?? fact.periodEnd) === period.periodEnd;
}
function compareStatementRowCandidates(
left: StatementRowCandidate,
right: StatementRowCandidate,
definition: StandardTemplateRowDefinition
) {
const matchDelta = matchRank(left.matchKind) - matchRank(right.matchKind);
if (matchDelta !== 0) {
return matchDelta;
}
if (left.aliasRank !== right.aliasRank) {
return left.aliasRank - right.aliasRank;
}
if (left.row.hasDimensions !== right.row.hasDimensions) {
return left.row.hasDimensions ? 1 : -1;
}
if (definition.selectionPolicy === 'prefer_primary_statement_concept' && left.row.isExtension !== right.row.isExtension) {
return left.row.isExtension ? 1 : -1;
}
if (left.row.order !== right.row.order) {
return left.row.order - right.row.order;
}
if (left.matchedPhraseTokenCount !== right.matchedPhraseTokenCount) {
return right.matchedPhraseTokenCount - left.matchedPhraseTokenCount;
}
if (left.labelTokenCount !== right.labelTokenCount) {
return left.labelTokenCount - right.labelTokenCount;
}
return left.row.label.localeCompare(right.row.label);
}
function compareFactCandidates(left: FactCandidate, right: FactCandidate) {
const matchDelta = matchRank(left.matchKind) - matchRank(right.matchKind);
if (matchDelta !== 0) {
return matchDelta;
}
if (left.aliasRank !== right.aliasRank) {
return left.aliasRank - right.aliasRank;
}
return left.fact.qname.localeCompare(right.fact.qname);
}
function compareResolvedCandidates(
left: ResolvedCandidate,
right: ResolvedCandidate,
definition: StandardTemplateRowDefinition
) {
const matchDelta = matchRank(left.matchKind) - matchRank(right.matchKind);
if (matchDelta !== 0) {
return matchDelta;
}
if (left.aliasRank !== right.aliasRank) {
return left.aliasRank - right.aliasRank;
}
if (left.sourceType === 'row' && right.sourceType === 'row') {
return compareStatementRowCandidates(left, right, definition);
}
if (left.sourceType === 'fact' && right.sourceType === 'fact') {
return compareFactCandidates(left, right);
}
if (left.sourceType === 'row' && right.sourceType === 'fact') {
return left.row.hasDimensions ? 1 : -1;
}
if (left.sourceType === 'fact' && right.sourceType === 'row') {
return right.row.hasDimensions ? -1 : 1;
}
return 0;
}
function buildTemplateRow(
definition: StandardTemplateRowDefinition,
candidates: StatementRowCandidate[],
factCandidates: FactCandidate[],
periods: FinancialStatementPeriod[]
) {
const sourceConcepts = new Set<string>();
const sourceRowKeys = new Set<string>();
const sourceFactIds = new Set<number>();
const matchedRowKeys = new Set<string>();
const values: Record<string, number | null> = Object.fromEntries(periods.map((period) => [period.id, null]));
const resolvedSourceRowKeys: Record<string, string | null> = Object.fromEntries(periods.map((period) => [period.id, null]));
const metadata: InternalRowMetadata = {
derivedRoleByPeriod: Object.fromEntries(periods.map((period) => [period.id, null]))
};
let unit = definition.unit;
let hasDimensions = false;
for (const period of periods) {
const resolvedCandidates = resolvedCandidatesForPeriod({
definition,
candidates,
factCandidates,
period
});
if (resolvedCandidates.length === 0) {
continue;
}
if (definition.key === 'depreciation_and_amortization') {
metadata.derivedRoleByPeriod[period.id] = resolvedCandidates.some((candidate) => {
const localName = candidate.sourceType === 'row'
? candidate.row.localName
: candidate.fact.localName;
return normalizeToken(localName) === normalizeToken('CostOfGoodsAndServicesSoldDepreciationAndAmortization');
})
? 'expense'
: 'addback';
}
values[period.id] = definition.selectionPolicy === 'aggregate_multiple_components'
? sumValues(resolvedCandidates.map((candidate) => {
if (candidate.sourceType === 'row') {
return applySignTransform(candidate.row.values[period.id] ?? null, definition.signTransform);
}
return applySignTransform(candidate.fact.value ?? null, definition.signTransform);
}))
: (() => {
const resolvedCandidate = resolvedCandidates[0]!;
if (resolvedCandidate.sourceType === 'row') {
return applySignTransform(resolvedCandidate.row.values[period.id] ?? null, definition.signTransform);
}
return applySignTransform(resolvedCandidate.fact.value ?? null, definition.signTransform);
})();
resolvedSourceRowKeys[period.id] = resolvedCandidates.length === 1
? (resolvedCandidates[0]!.sourceType === 'row'
? resolvedCandidates[0]!.row.key
: resolvedCandidates[0]!.fact.conceptKey ?? null)
: null;
for (const resolvedCandidate of resolvedCandidates) {
unit = resolvedCandidate.unit;
if (resolvedCandidate.sourceType === 'row') {
hasDimensions = hasDimensions || resolvedCandidate.row.hasDimensions;
matchedRowKeys.add(resolvedCandidate.row.key);
sourceConcepts.add(resolvedCandidate.row.qname);
sourceRowKeys.add(resolvedCandidate.row.key);
for (const factId of resolvedCandidate.row.sourceFactIds) {
sourceFactIds.add(factId);
}
continue;
}
sourceConcepts.add(resolvedCandidate.fact.qname);
sourceRowKeys.add(resolvedCandidate.fact.conceptKey);
sourceFactIds.add(resolvedCandidate.fact.id);
}
}
return {
row: {
key: definition.key,
label: definition.label,
category: definition.category,
templateSection: definition.category,
order: definition.order,
unit,
values,
sourceConcepts: [...sourceConcepts].sort((left, right) => left.localeCompare(right)),
sourceRowKeys: [...sourceRowKeys].sort((left, right) => left.localeCompare(right)),
sourceFactIds: [...sourceFactIds].sort((left, right) => left - right),
formulaKey: null,
hasDimensions,
resolvedSourceRowKeys
} satisfies StandardizedFinancialRow,
matchedRowKeys,
metadata
};
}
function computeFormulaValue(
formula: TemplateFormula,
rowsByKey: Map<string, StandardizedFinancialRow>,
periodId: string
) {
switch (formula.kind) {
case 'sum':
return sumValues(
formula.sourceKeys.map((key) => valueOrNull(rowsByKey.get(key)?.values ?? {}, periodId)),
formula.treatNullAsZero ?? false
);
case 'subtract':
return subtractValues(
valueOrNull(rowsByKey.get(formula.left)?.values ?? {}, periodId),
valueOrNull(rowsByKey.get(formula.right)?.values ?? {}, periodId)
);
case 'divide':
return divideValues(
valueOrNull(rowsByKey.get(formula.numerator)?.values ?? {}, periodId),
valueOrNull(rowsByKey.get(formula.denominator)?.values ?? {}, periodId)
);
}
}
function rowValueForPeriod(
rowsByKey: Map<string, StandardizedFinancialRow>,
key: string,
periodId: string
) {
return valueOrNull(rowsByKey.get(key)?.values ?? {}, periodId);
}
function computeOperatingIncomeFallbackValue(
rowsByKey: Map<string, StandardizedFinancialRow>,
rowMetadataByKey: Map<string, InternalRowMetadata>,
periodId: string
) {
const grossProfit = rowValueForPeriod(rowsByKey, 'gross_profit', periodId);
const sellingGeneralAndAdministrative = rowValueForPeriod(rowsByKey, 'selling_general_and_administrative', periodId);
const researchAndDevelopment = rowValueForPeriod(rowsByKey, 'research_and_development', periodId) ?? 0;
const depreciationAndAmortization = rowValueForPeriod(rowsByKey, 'depreciation_and_amortization', periodId);
const depreciationRole = rowMetadataByKey.get('depreciation_and_amortization')?.derivedRoleByPeriod[periodId] ?? null;
if (
depreciationRole === 'expense'
&& grossProfit !== null
&& sellingGeneralAndAdministrative !== null
&& depreciationAndAmortization !== null
) {
return grossProfit - sellingGeneralAndAdministrative - researchAndDevelopment - depreciationAndAmortization;
}
const pretaxIncome = rowValueForPeriod(rowsByKey, 'pretax_income', periodId);
if (pretaxIncome === null) {
return null;
}
const interestExpense = rowValueForPeriod(rowsByKey, 'interest_expense', periodId) ?? 0;
const interestIncome = rowValueForPeriod(rowsByKey, 'interest_income', periodId) ?? 0;
const otherNonOperatingIncome = rowValueForPeriod(rowsByKey, 'other_non_operating_income', periodId) ?? 0;
return pretaxIncome + interestExpense - interestIncome - otherNonOperatingIncome;
}
function computeFallbackValueForDefinition(
definition: StandardTemplateRowDefinition,
rowsByKey: Map<string, StandardizedFinancialRow>,
rowMetadataByKey: Map<string, InternalRowMetadata>,
periodId: string
) {
if (definition.key === 'operating_income') {
return computeOperatingIncomeFallbackValue(rowsByKey, rowMetadataByKey, periodId);
}
if (!definition.fallbackFormula) {
return null;
}
return computeFormulaValue(definition.fallbackFormula, rowsByKey, periodId);
}
function applyFormulas(
rowsByKey: Map<string, StandardizedFinancialRow>,
rowMetadataByKey: Map<string, InternalRowMetadata>,
definitions: StandardTemplateRowDefinition[],
periods: FinancialStatementPeriod[]
) {
for (let pass = 0; pass < definitions.length; pass += 1) {
let changed = false;
for (const definition of definitions) {
if (!definition.fallbackFormula && definition.key !== 'operating_income') {
continue;
}
const target = rowsByKey.get(definition.key);
if (!target) {
continue;
}
let usedFormula = target.formulaKey !== null;
for (const period of periods) {
if (definition.selectionPolicy !== 'formula_only' && target.values[period.id] !== null) {
continue;
}
const computed = computeFallbackValueForDefinition(definition, rowsByKey, rowMetadataByKey, period.id);
if (computed === null) {
continue;
}
target.values[period.id] = applySignTransform(computed, definition.signTransform);
target.resolvedSourceRowKeys[period.id] = null;
usedFormula = true;
changed = true;
}
if (usedFormula) {
target.formulaKey = definition.key;
}
}
if (!changed) {
break;
}
}
}
export function buildStandardizedRows(input: {
rows: TaxonomyStatementRow[];
statement: Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
periods: FinancialStatementPeriod[];
facts: TaxonomyFactRow[];
}) {
const definitions = STANDARD_FINANCIAL_TEMPLATES[input.statement];
const rowsByKey = new Map<string, StandardizedFinancialRow>();
const rowMetadataByKey = new Map<string, InternalRowMetadata>();
const matchedRowKeys = new Set<string>();
for (const definition of definitions) {
const candidates = input.rows
.map((row) => classifyStatementRowCandidate(row, definition))
.filter((candidate): candidate is StatementRowCandidate => candidate !== null);
const factCandidates = input.facts
.map((fact) => classifyFactCandidate(fact, definition))
.filter((candidate): candidate is FactCandidate => candidate !== null);
const templateRow = buildTemplateRow(definition, candidates, factCandidates, input.periods);
for (const rowKey of templateRow.matchedRowKeys) {
matchedRowKeys.add(rowKey);
}
const hasAnyValue = Object.values(templateRow.row.values).some((value) => value !== null);
if (hasAnyValue || definition.fallbackFormula || definition.key === 'operating_income') {
rowsByKey.set(definition.key, templateRow.row);
rowMetadataByKey.set(definition.key, templateRow.metadata);
}
}
applyFormulas(rowsByKey, rowMetadataByKey, definitions, input.periods);
const templateRows = definitions
.filter((definition) => definition.includeInOutput !== false)
.map((definition) => rowsByKey.get(definition.key))
.filter((row): row is StandardizedFinancialRow => row !== undefined);
const coveredTemplateSourceRowKeys = new Set(templateRows.flatMap((row) => row.sourceRowKeys));
const unmatchedRows = input.rows
.filter((row) => !matchedRowKeys.has(row.key))
.filter((row) => !(row.hasDimensions && coveredTemplateSourceRowKeys.has(row.key)))
.map((row) => ({
key: `other:${row.key}`,
label: row.label,
category: 'other',
templateSection: 'other',
order: 10_000 + row.order,
unit: inferUnit(Object.values(row.units)[0] ?? null, 'currency'),
values: { ...row.values },
sourceConcepts: [row.qname],
sourceRowKeys: [row.key],
sourceFactIds: [...row.sourceFactIds],
formulaKey: null,
hasDimensions: row.hasDimensions,
resolvedSourceRowKeys: Object.fromEntries(
input.periods.map((period) => [period.id, period.id in row.values ? row.key : null])
)
} satisfies StandardizedFinancialRow));
return [...templateRows, ...unmatchedRows].sort((left, right) => {
if (left.order !== right.order) {
return left.order - right.order;
}
return left.label.localeCompare(right.label);
});
}
export function buildDimensionBreakdown(
facts: TaxonomyFactRow[],
periods: FinancialStatementPeriod[],


@@ -1,320 +0,0 @@
import type {
DetailFinancialRow,
FinancialStatementKind,
FinancialStatementPeriod,
NormalizationSummary,
StructuredKpiRow,
SurfaceDetailMap,
SurfaceFinancialRow,
TaxonomyFactRow,
TaxonomyStatementRow
} from '@/lib/types';
import { buildStandardizedRows } from '@/lib/server/financials/standardize';
type CompactStatement = Extract<FinancialStatementKind, 'income' | 'balance' | 'cash_flow'>;
type SurfaceDefinition = {
key: string;
label: string;
category: string;
order: number;
unit: SurfaceFinancialRow['unit'];
rowKey?: string;
componentKeys?: string[];
formula?: {
kind: 'subtract';
left: string;
right: string;
};
};
const EMPTY_SURFACE_ROWS: Record<FinancialStatementKind, SurfaceFinancialRow[]> = {
income: [],
balance: [],
cash_flow: [],
equity: [],
comprehensive_income: []
};
const EMPTY_DETAIL_ROWS: Record<FinancialStatementKind, SurfaceDetailMap> = {
income: {},
balance: {},
cash_flow: {},
equity: {},
comprehensive_income: {}
};
const SURFACE_DEFINITIONS: Record<CompactStatement, SurfaceDefinition[]> = {
income: [
{ key: 'revenue', label: 'Revenue', category: 'surface', order: 10, unit: 'currency', rowKey: 'revenue' },
{ key: 'cost_of_revenue', label: 'Cost of Revenue', category: 'surface', order: 20, unit: 'currency', rowKey: 'cost_of_revenue' },
{ key: 'gross_profit', label: 'Gross Profit', category: 'surface', order: 30, unit: 'currency', rowKey: 'gross_profit' },
{
key: 'operating_expenses',
label: 'Operating Expenses',
category: 'surface',
order: 40,
unit: 'currency',
componentKeys: ['selling_general_and_administrative', 'research_and_development', 'depreciation_and_amortization']
},
{ key: 'operating_income', label: 'Operating Income', category: 'surface', order: 50, unit: 'currency', rowKey: 'operating_income' },
{
key: 'interest_and_other',
label: 'Interest and Other',
category: 'surface',
order: 60,
unit: 'currency',
formula: {
kind: 'subtract',
left: 'pretax_income',
right: 'operating_income'
}
},
{ key: 'pretax_income', label: 'Pretax Income', category: 'surface', order: 70, unit: 'currency', rowKey: 'pretax_income' },
{ key: 'income_taxes', label: 'Income Taxes', category: 'surface', order: 80, unit: 'currency', rowKey: 'income_tax_expense' },
{ key: 'net_income', label: 'Net Income', category: 'surface', order: 90, unit: 'currency', rowKey: 'net_income' }
],
balance: [
{ key: 'cash_and_equivalents', label: 'Cash and Equivalents', category: 'surface', order: 10, unit: 'currency', rowKey: 'cash_and_equivalents' },
{ key: 'receivables', label: 'Receivables', category: 'surface', order: 20, unit: 'currency', rowKey: 'accounts_receivable' },
{ key: 'inventory', label: 'Inventory', category: 'surface', order: 30, unit: 'currency', rowKey: 'inventory' },
{ key: 'current_assets', label: 'Current Assets', category: 'surface', order: 40, unit: 'currency', rowKey: 'current_assets' },
{ key: 'ppe', label: 'Property, Plant & Equipment', category: 'surface', order: 50, unit: 'currency', rowKey: 'property_plant_equipment' },
{
key: 'goodwill_and_intangibles',
label: 'Goodwill and Intangibles',
category: 'surface',
order: 60,
unit: 'currency',
componentKeys: ['goodwill', 'intangible_assets']
},
{ key: 'total_assets', label: 'Total Assets', category: 'surface', order: 70, unit: 'currency', rowKey: 'total_assets' },
{ key: 'current_liabilities', label: 'Current Liabilities', category: 'surface', order: 80, unit: 'currency', rowKey: 'current_liabilities' },
{ key: 'debt', label: 'Debt', category: 'surface', order: 90, unit: 'currency', rowKey: 'total_debt' },
{ key: 'total_liabilities', label: 'Total Liabilities', category: 'surface', order: 100, unit: 'currency', rowKey: 'total_liabilities' },
{ key: 'shareholders_equity', label: 'Shareholders Equity', category: 'surface', order: 110, unit: 'currency', rowKey: 'total_equity' }
],
cash_flow: [
{ key: 'operating_cash_flow', label: 'Operating Cash Flow', category: 'surface', order: 10, unit: 'currency', rowKey: 'operating_cash_flow' },
{ key: 'capital_expenditures', label: 'Capital Expenditures', category: 'surface', order: 20, unit: 'currency', rowKey: 'capital_expenditures' },
{ key: 'acquisitions', label: 'Acquisitions', category: 'surface', order: 30, unit: 'currency', rowKey: 'acquisitions' },
{ key: 'investing_cash_flow', label: 'Investing Cash Flow', category: 'surface', order: 40, unit: 'currency', rowKey: 'investing_cash_flow' },
{ key: 'financing_cash_flow', label: 'Financing Cash Flow', category: 'surface', order: 50, unit: 'currency', rowKey: 'financing_cash_flow' },
{ key: 'free_cash_flow', label: 'Free Cash Flow', category: 'surface', order: 60, unit: 'currency', rowKey: 'free_cash_flow' }
]
};
function rowHasAnyValue(row: { values: Record<string, number | null> }) {
return Object.values(row.values).some((value) => value !== null);
}
function sumValues(values: Array<number | null>) {
if (values.every((value) => value === null)) {
return null;
}
return values.reduce<number>((sum, value) => sum + (value ?? 0), 0);
}
function valueForPeriod(
rowByKey: Map<string, SurfaceFinancialRow>,
rowKey: string,
periodId: string
) {
return rowByKey.get(rowKey)?.values[periodId] ?? null;
}
function maxAbsValue(values: Record<string, number | null>) {
return Object.values(values).reduce<number>((max, value) => Math.max(max, Math.abs(value ?? 0)), 0);
}
function detailUnit(row: SurfaceFinancialRow, faithfulRow: TaxonomyStatementRow | undefined) {
if (faithfulRow) {
return Object.values(faithfulRow.units)[0] ?? null;
}
switch (row.unit) {
case 'currency':
return 'USD';
case 'shares':
return 'shares';
case 'percent':
return 'pure';
default:
return null;
}
}
function buildDetailRow(input: {
row: SurfaceFinancialRow;
parentSurfaceKey: string;
faithfulRowByKey: Map<string, TaxonomyStatementRow>;
}): DetailFinancialRow {
const sourceRowKey = input.row.sourceRowKeys.find((key) => input.faithfulRowByKey.has(key)) ?? input.row.sourceRowKeys[0] ?? input.row.key;
const faithfulRow = sourceRowKey ? input.faithfulRowByKey.get(sourceRowKey) : undefined;
const qname = faithfulRow?.qname ?? input.row.sourceConcepts[0] ?? input.row.key;
const [prefix, ...rest] = qname.split(':');
const localName = faithfulRow?.localName ?? (rest.length > 0 ? rest.join(':') : qname);
return {
key: input.row.key,
parentSurfaceKey: input.parentSurfaceKey,
label: input.row.label,
conceptKey: faithfulRow?.conceptKey ?? sourceRowKey,
qname,
namespaceUri: faithfulRow?.namespaceUri ?? (prefix && rest.length > 0 ? `urn:unknown:${prefix}` : 'urn:surface'),
localName,
unit: detailUnit(input.row, faithfulRow),
values: { ...input.row.values },
sourceFactIds: [...input.row.sourceFactIds],
isExtension: faithfulRow?.isExtension ?? false,
dimensionsSummary: faithfulRow?.hasDimensions ? ['has_dimensions'] : [],
residualFlag: input.parentSurfaceKey === 'unmapped'
};
}
function baselineForStatement(statement: CompactStatement, rowByKey: Map<string, SurfaceFinancialRow>) {
const anchorKey = statement === 'balance' ? 'total_assets' : 'revenue';
return maxAbsValue(rowByKey.get(anchorKey)?.values ?? {});
}
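// Materiality floor for residual rows: the larger of a fixed minimum and a share of the
// statement baseline (0.5% for the balance sheet, 1% for every other statement).
function materialityThreshold(statement: CompactStatement, baseline: number) {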
if (statement === 'balance') {
return Math.max(5_000_000, baseline * 0.005);
}
return Math.max(1_000_000, baseline * 0.01);
}
export function buildCompactHydrationModel(input: {
periods: FinancialStatementPeriod[];
faithfulRows: Record<FinancialStatementKind, TaxonomyStatementRow[]>;
facts: TaxonomyFactRow[];
kpiRows?: StructuredKpiRow[];
}) {
const surfaceRows = structuredClone(EMPTY_SURFACE_ROWS);
const detailRows = structuredClone(EMPTY_DETAIL_ROWS);
let surfaceRowCount = 0;
let detailRowCount = 0;
let unmappedRowCount = 0;
let materialUnmappedRowCount = 0;
for (const statement of Object.keys(SURFACE_DEFINITIONS) as CompactStatement[]) {
const faithfulRows = input.faithfulRows[statement] ?? [];
const facts = input.facts.filter((fact) => fact.statement === statement);
const fullRows = buildStandardizedRows({
rows: faithfulRows,
statement,
periods: input.periods,
facts
});
const rowByKey = new Map(fullRows.map((row) => [row.key, row]));
const faithfulRowByKey = new Map(faithfulRows.map((row) => [row.key, row]));
const statementDetails: SurfaceDetailMap = {};
for (const definition of SURFACE_DEFINITIONS[statement]) {
const contributingRows = definition.rowKey
? [rowByKey.get(definition.rowKey)].filter((row): row is SurfaceFinancialRow => row !== undefined)
: (definition.componentKeys ?? [])
.map((key) => rowByKey.get(key))
.filter((row): row is SurfaceFinancialRow => row !== undefined);
const values = Object.fromEntries(input.periods.map((period) => {
const nextValue = definition.rowKey
? valueForPeriod(rowByKey, definition.rowKey, period.id)
: definition.formula
? (() => {
const left = valueForPeriod(rowByKey, definition.formula!.left, period.id);
const right = valueForPeriod(rowByKey, definition.formula!.right, period.id);
return left === null || right === null ? null : left - right;
})()
: sumValues(contributingRows.map((row) => row.values[period.id] ?? null));
return [period.id, nextValue];
})) satisfies Record<string, number | null>;
if (!rowHasAnyValue({ values })) {
continue;
}
const sourceConcepts = [...new Set(contributingRows.flatMap((row) => row.sourceConcepts))].sort((left, right) => left.localeCompare(right));
const sourceRowKeys = [...new Set(contributingRows.flatMap((row) => row.sourceRowKeys))].sort((left, right) => left.localeCompare(right));
const sourceFactIds = [...new Set(contributingRows.flatMap((row) => row.sourceFactIds))].sort((left, right) => left - right);
const hasDimensions = contributingRows.some((row) => row.hasDimensions);
const resolvedSourceRowKeys = Object.fromEntries(input.periods.map((period) => [
period.id,
definition.rowKey
? rowByKey.get(definition.rowKey)?.resolvedSourceRowKeys[period.id] ?? null
: null
]));
const rowsForDetail = definition.componentKeys
? contributingRows
: [];
const details = rowsForDetail
.filter((row) => rowHasAnyValue(row))
.map((row) => buildDetailRow({
row,
parentSurfaceKey: definition.key,
faithfulRowByKey
}));
statementDetails[definition.key] = details;
detailRowCount += details.length;
surfaceRows[statement].push({
key: definition.key,
label: definition.label,
category: definition.category,
templateSection: definition.category,
order: definition.order,
unit: definition.unit,
values,
sourceConcepts,
sourceRowKeys,
sourceFactIds,
formulaKey: definition.formula ? definition.key : null,
hasDimensions,
resolvedSourceRowKeys,
statement,
detailCount: details.length
});
surfaceRowCount += 1;
}
const baseline = baselineForStatement(statement, rowByKey);
const threshold = materialityThreshold(statement, baseline);
const residualRows = fullRows
.filter((row) => row.key.startsWith('other:'))
.filter((row) => rowHasAnyValue(row))
.map((row) => buildDetailRow({
row,
parentSurfaceKey: 'unmapped',
faithfulRowByKey
}));
if (residualRows.length > 0) {
statementDetails.unmapped = residualRows;
detailRowCount += residualRows.length;
unmappedRowCount += residualRows.length;
materialUnmappedRowCount += residualRows.filter((row) => maxAbsValue(row.values) >= threshold).length;
}
detailRows[statement] = statementDetails;
}
const normalizationSummary: NormalizationSummary = {
surfaceRowCount,
detailRowCount,
kpiRowCount: input.kpiRows?.length ?? 0,
unmappedRowCount,
materialUnmappedRowCount,
warnings: []
};
return {
surfaceRows,
detailRows,
normalizationSummary
};
}
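The materiality check used for unmapped rows above can be illustrated in isolation. The sketch below re-declares `maxAbsValue` and `materialityThreshold` as stand-alone copies; the period ids, row keys, and dollar amounts are made-up illustrative values, not data from any filing.

```typescript
// Stand-alone copies of the removed helpers, for illustration only.
type CompactStatement = 'income' | 'balance' | 'cash_flow';

function maxAbsValue(values: Record<string, number | null>): number {
  return Object.values(values).reduce<number>(
    (max, value) => Math.max(max, Math.abs(value ?? 0)),
    0
  );
}

function materialityThreshold(statement: CompactStatement, baseline: number): number {
  if (statement === 'balance') {
    return Math.max(5_000_000, baseline * 0.005);
  }
  return Math.max(1_000_000, baseline * 0.01);
}

// Hypothetical revenue row: the baseline is its largest absolute value across periods.
const exampleRevenue: Record<string, number | null> = {
  fy2023: 400_000_000,
  fy2024: 500_000_000
};
const exampleThreshold = materialityThreshold('income', maxAbsValue(exampleRevenue));

// Hypothetical residual ("other:") rows that did not map to any surface.
const exampleResidualRows: Array<{ key: string; values: Record<string, number | null> }> = [
  { key: 'other:RestructuringCharges', values: { fy2023: null, fy2024: -7_500_000 } },
  { key: 'other:MiscFees', values: { fy2023: 900_000, fy2024: 1_200_000 } }
];

// A residual row counts as material when its largest absolute value reaches the threshold.
const exampleMaterialKeys = exampleResidualRows
  .filter((row) => maxAbsValue(row.values) >= exampleThreshold)
  .map((row) => row.key);
```

With these assumed numbers the income threshold is about 1% of 500 million, so only the restructuring row would be counted in `materialUnmappedRowCount`.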

@@ -1,400 +0,0 @@
import type { Filing, FinancialStatementKind, TaxonomyStatementRow } from '@/lib/types';
import type { TaxonomyConcept, TaxonomyFact, TaxonomyPresentationConcept } from '@/lib/server/taxonomy/types';
import type { FilingTaxonomyPeriod } from '@/lib/server/repos/filing-taxonomy';
import { classifyStatementRole, conceptStatementFallback } from '@/lib/server/taxonomy/classifiers';
function compactAccessionNumber(value: string) {
return value.replace(/-/g, '');
}
function isUsGaapNamespace(namespaceUri: string) {
return /fasb\.org\/us-gaap/i.test(namespaceUri) || /us-gaap/i.test(namespaceUri);
}
function splitConceptKey(conceptKey: string) {
const index = conceptKey.lastIndexOf('#');
if (index < 0) {
return {
namespaceUri: 'urn:unknown',
localName: conceptKey
};
}
return {
namespaceUri: conceptKey.slice(0, index),
localName: conceptKey.slice(index + 1)
};
}
function localNameToLabel(localName: string) {
return localName
.replace(/([a-z0-9])([A-Z])/g, '$1 $2')
.replace(/([A-Z]+)([A-Z][a-z])/g, '$1 $2')
.replace(/_/g, ' ')
.trim();
}
function createStatementRecord<T>(factory: () => T): Record<FinancialStatementKind, T> {
return {
income: factory(),
balance: factory(),
cash_flow: factory(),
equity: factory(),
comprehensive_income: factory()
};
}
function periodSignature(fact: TaxonomyFact) {
const start = fact.periodStart ?? '';
const end = fact.periodEnd ?? '';
const instant = fact.periodInstant ?? '';
return `start:${start}|end:${end}|instant:${instant}`;
}
function periodDate(fact: TaxonomyFact, fallbackDate: string) {
return fact.periodEnd ?? fact.periodInstant ?? fallbackDate;
}
function parseEpoch(value: string | null) {
if (!value) {
return Number.NaN;
}
return Date.parse(value);
}
function sortPeriods(periods: FilingTaxonomyPeriod[]) {
return [...periods].sort((left, right) => {
const leftDate = parseEpoch(left.periodEnd ?? left.filingDate);
const rightDate = parseEpoch(right.periodEnd ?? right.filingDate);
if (Number.isFinite(leftDate) && Number.isFinite(rightDate) && leftDate !== rightDate) {
return leftDate - rightDate;
}
return left.id.localeCompare(right.id);
});
}
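// Picks one fact per period: dimensionless facts first, then the latest period end,
// then the largest absolute value.
function pickPreferredFact<T extends TaxonomyFact>(facts: T[]) {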
if (facts.length === 0) {
return null;
}
const ordered = [...facts].sort((left, right) => {
const leftScore = left.isDimensionless ? 1 : 0;
const rightScore = right.isDimensionless ? 1 : 0;
if (leftScore !== rightScore) {
return rightScore - leftScore;
}
const leftDate = parseEpoch(left.periodEnd ?? left.periodInstant);
const rightDate = parseEpoch(right.periodEnd ?? right.periodInstant);
if (Number.isFinite(leftDate) && Number.isFinite(rightDate) && leftDate !== rightDate) {
return rightDate - leftDate;
}
return Math.abs(right.value) - Math.abs(left.value);
});
return ordered[0] ?? null;
}
export function materializeTaxonomyStatements(input: {
filingId: number;
accessionNumber: string;
filingDate: string;
filingType: '10-K' | '10-Q';
facts: TaxonomyFact[];
presentation: TaxonomyPresentationConcept[];
labelByConcept: Map<string, string>;
}) {
const periodBySignature = new Map<string, FilingTaxonomyPeriod>();
const compactAccession = compactAccessionNumber(input.accessionNumber);
for (const fact of input.facts) {
const signature = periodSignature(fact);
if (periodBySignature.has(signature)) {
continue;
}
const date = periodDate(fact, input.filingDate);
const id = `${date}-${compactAccession}-${periodBySignature.size + 1}`;
periodBySignature.set(signature, {
id,
filingId: input.filingId,
accessionNumber: input.accessionNumber,
filingDate: input.filingDate,
periodStart: fact.periodStart,
periodEnd: fact.periodEnd ?? fact.periodInstant ?? input.filingDate,
filingType: input.filingType,
periodLabel: fact.periodInstant && !fact.periodStart
? 'Instant'
: fact.periodStart && fact.periodEnd
? `${fact.periodStart} to ${fact.periodEnd}`
: 'Filing Period'
});
}
const periods = sortPeriods([...periodBySignature.values()]);
const periodIdBySignature = new Map<string, string>(
[...periodBySignature.entries()].map(([signature, period]) => [signature, period.id])
);
const presentationByConcept = new Map<string, TaxonomyPresentationConcept[]>();
for (const node of input.presentation) {
const existing = presentationByConcept.get(node.conceptKey);
if (existing) {
existing.push(node);
} else {
presentationByConcept.set(node.conceptKey, [node]);
}
}
const enrichedFacts = input.facts.map((fact, index) => {
const nodes = presentationByConcept.get(fact.conceptKey) ?? [];
const bestNode = nodes[0] ?? null;
const statementKind = bestNode
? classifyStatementRole(bestNode.roleUri)
: conceptStatementFallback(fact.localName);
return {
...fact,
__sourceFactId: index + 1,
statement_kind: statementKind,
role_uri: bestNode?.roleUri ?? null
};
});
const rowsByStatement = createStatementRecord<TaxonomyStatementRow[]>(() => []);
const conceptByKey = new Map<string, TaxonomyConcept>();
const groupedByStatement = createStatementRecord<Map<string, typeof enrichedFacts>>(() => new Map());
for (const fact of enrichedFacts) {
if (!fact.statement_kind) {
continue;
}
const group = groupedByStatement[fact.statement_kind].get(fact.conceptKey);
if (group) {
group.push(fact);
} else {
groupedByStatement[fact.statement_kind].set(fact.conceptKey, [fact]);
}
}
for (const statement of Object.keys(rowsByStatement) as FinancialStatementKind[]) {
const conceptKeys = new Set<string>();
for (const node of input.presentation) {
if (classifyStatementRole(node.roleUri) !== statement) {
continue;
}
conceptKeys.add(node.conceptKey);
}
for (const conceptKey of groupedByStatement[statement].keys()) {
conceptKeys.add(conceptKey);
}
const orderedConcepts = [...conceptKeys]
.map((conceptKey) => {
const presentationNodes = input.presentation.filter(
(node) => node.conceptKey === conceptKey && classifyStatementRole(node.roleUri) === statement
);
const presentationOrder = presentationNodes.length > 0
? Math.min(...presentationNodes.map((node) => node.order))
: Number.MAX_SAFE_INTEGER;
const presentationDepth = presentationNodes.length > 0
? Math.min(...presentationNodes.map((node) => node.depth))
: 0;
const roleUri = presentationNodes[0]?.roleUri ?? null;
const parentConceptKey = presentationNodes[0]?.parentConceptKey ?? null;
return {
conceptKey,
presentationOrder,
presentationDepth,
roleUri,
parentConceptKey
};
})
.sort((left, right) => {
if (left.presentationOrder !== right.presentationOrder) {
return left.presentationOrder - right.presentationOrder;
}
return left.conceptKey.localeCompare(right.conceptKey);
});
for (const orderedConcept of orderedConcepts) {
const facts = groupedByStatement[statement].get(orderedConcept.conceptKey) ?? [];
const { namespaceUri, localName } = splitConceptKey(orderedConcept.conceptKey);
const qname = facts[0]?.qname ?? `unknown:${localName}`;
const label = input.labelByConcept.get(orderedConcept.conceptKey) ?? localNameToLabel(localName);
const values: Record<string, number | null> = {};
const units: Record<string, string | null> = {};
const factGroups = new Map<string, typeof facts>();
for (const fact of facts) {
const signature = periodSignature(fact);
const group = factGroups.get(signature);
if (group) {
group.push(fact);
} else {
factGroups.set(signature, [fact]);
}
}
const sourceFactIds: number[] = [];
let hasDimensions = false;
for (const [signature, group] of factGroups.entries()) {
const periodId = periodIdBySignature.get(signature);
if (!periodId) {
continue;
}
const preferred = pickPreferredFact(group);
if (!preferred) {
continue;
}
values[periodId] = preferred.value;
units[periodId] = preferred.unit;
const sourceFactId = (preferred as { __sourceFactId?: number }).__sourceFactId;
if (typeof sourceFactId === 'number') {
sourceFactIds.push(sourceFactId);
}
if (group.some((entry) => !entry.isDimensionless)) {
hasDimensions = true;
}
}
if (Object.keys(values).length === 0) {
continue;
}
const row: TaxonomyStatementRow = {
key: orderedConcept.conceptKey,
label,
conceptKey: orderedConcept.conceptKey,
qname,
namespaceUri,
localName,
isExtension: !isUsGaapNamespace(namespaceUri),
statement,
roleUri: orderedConcept.roleUri,
order: orderedConcept.presentationOrder !== Number.MAX_SAFE_INTEGER
? orderedConcept.presentationOrder
: rowsByStatement[statement].length + 1,
depth: orderedConcept.presentationDepth,
parentKey: orderedConcept.parentConceptKey,
values,
units,
hasDimensions,
sourceFactIds
};
rowsByStatement[statement].push(row);
if (!conceptByKey.has(orderedConcept.conceptKey)) {
conceptByKey.set(orderedConcept.conceptKey, {
concept_key: orderedConcept.conceptKey,
qname,
namespace_uri: namespaceUri,
local_name: localName,
label,
is_extension: !isUsGaapNamespace(namespaceUri),
balance: null,
period_type: null,
data_type: null,
statement_kind: statement,
role_uri: orderedConcept.roleUri,
authoritative_concept_key: null,
mapping_method: null,
surface_key: null,
detail_parent_surface_key: null,
kpi_key: null,
residual_flag: false,
presentation_order: row.order,
presentation_depth: row.depth,
parent_concept_key: row.parentKey,
is_abstract: /abstract/i.test(localName)
});
}
}
}
for (const fact of enrichedFacts) {
if (conceptByKey.has(fact.conceptKey)) {
continue;
}
conceptByKey.set(fact.conceptKey, {
concept_key: fact.conceptKey,
qname: fact.qname,
namespace_uri: fact.namespaceUri,
local_name: fact.localName,
label: input.labelByConcept.get(fact.conceptKey) ?? localNameToLabel(fact.localName),
is_extension: !isUsGaapNamespace(fact.namespaceUri),
balance: null,
period_type: null,
data_type: fact.dataType,
statement_kind: fact.statement_kind,
role_uri: fact.role_uri,
authoritative_concept_key: null,
mapping_method: null,
surface_key: null,
detail_parent_surface_key: null,
kpi_key: null,
residual_flag: false,
presentation_order: null,
presentation_depth: null,
parent_concept_key: null,
is_abstract: /abstract/i.test(fact.localName)
});
}
const concepts = [...conceptByKey.values()];
const factRows = enrichedFacts.map((fact) => ({
concept_key: fact.conceptKey,
qname: fact.qname,
namespace_uri: fact.namespaceUri,
local_name: fact.localName,
data_type: fact.dataType,
statement_kind: fact.statement_kind,
role_uri: fact.role_uri,
authoritative_concept_key: null,
mapping_method: null,
surface_key: null,
detail_parent_surface_key: null,
kpi_key: null,
residual_flag: false,
context_id: fact.contextId,
unit: fact.unit,
decimals: fact.decimals,
precision: fact.precision,
nil: fact.nil,
value_num: fact.value,
period_start: fact.periodStart,
period_end: fact.periodEnd,
period_instant: fact.periodInstant,
dimensions: fact.dimensions,
is_dimensionless: fact.isDimensionless,
source_file: fact.sourceFile,
}));
const dimensionsCount = enrichedFacts.reduce((total, fact) => {
return total + fact.dimensions.length;
}, 0);
return {
periods,
statement_rows: rowsByStatement,
concepts,
facts: factRows,
dimensionsCount
};
}
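As a small illustration of how the removed code derived a fallback label when no label was supplied for a concept, the sketch below copies `splitConceptKey` and `localNameToLabel` into a stand-alone form. The concept keys are assumed examples; the exact namespace URIs seen in real filings may differ.

```typescript
// Stand-alone copies of two helpers from the removed file, for illustration only.
function splitConceptKey(conceptKey: string): { namespaceUri: string; localName: string } {
  const index = conceptKey.lastIndexOf('#');
  if (index < 0) {
    // No '#' separator: treat the whole key as the local name.
    return { namespaceUri: 'urn:unknown', localName: conceptKey };
  }
  return {
    namespaceUri: conceptKey.slice(0, index),
    localName: conceptKey.slice(index + 1)
  };
}

function localNameToLabel(localName: string): string {
  return localName
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // split lower/digit followed by upper
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1 $2') // split an acronym from the next word
    .replace(/_/g, ' ')
    .trim();
}

// Hypothetical concept key; the namespace URI is an assumed example.
const exampleParts = splitConceptKey('http://fasb.org/us-gaap/2024#CostOfGoodsAndServicesSold');
const exampleLabel = localNameToLabel(exampleParts.localName);

// A key without '#' falls back to the 'urn:unknown' namespace.
const exampleBare = splitConceptKey('CustomRevenueMetric');
const exampleAcronymLabel = localNameToLabel('EBITDAMargin');
```

Here `exampleLabel` comes out as `Cost Of Goods And Services Sold`, and the acronym case yields `EBITDA Margin`, which shows why the second regular expression is needed.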

@@ -27,6 +27,31 @@
"not_meaningful_for_pack": false,
"warning_codes_when_used": []
},
"cost_of_revenue": {
"direct_authoritative_concepts": [
"us-gaap:CostOfRevenue"
],
"direct_source_concepts": [
"CostOfRevenue",
"CostOfGoodsSold",
"CostOfSales",
"CostOfGoodsAndServicesSold",
"CostOfGoodsAndServiceExcludingDepreciationDepletionAndAmortization",
"CostOfProductsSold",
"CostOfServices"
],
"component_surfaces": {
"positive": [],
"negative": []
},
"component_concept_groups": {
"positive": [],
"negative": []
},
"formula": "direct",
"not_meaningful_for_pack": false,
"warning_codes_when_used": []
},
"gross_profit": {
"direct_authoritative_concepts": [
"us-gaap:GrossProfit"
@@ -38,24 +63,13 @@
"positive": [
"revenue"
],
"negative": []
"negative": [
"cost_of_revenue"
]
},
"component_concept_groups": {
"positive": [],
"negative": [
{
"name": "cost_of_revenue",
"concepts": [
"us-gaap:CostOfRevenue",
"us-gaap:CostOfGoodsSold",
"us-gaap:CostOfSales",
"us-gaap:CostOfGoodsAndServicesSold",
"us-gaap:CostOfGoodsAndServiceExcludingDepreciationDepletionAndAmortization",
"us-gaap:CostOfProductsSold",
"us-gaap:CostOfServices"
]
}
]
"negative": []
},
"formula": "subtract",
"not_meaningful_for_pack": false,
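The `gross_profit` bridge entry above now names `cost_of_revenue` as a negative component surface instead of listing the cost concepts inline. One plausible reading of a `"subtract"` formula over `component_surfaces` is sketched below. This is an assumption for illustration, not the Rust sidecar's actual logic: `evaluateSubtract`, `ComponentSurfaces`, and the null handling are all hypothetical.

```typescript
// Hypothetical shape mirroring the "component_surfaces" object in the bridge JSON.
type ComponentSurfaces = { positive: string[]; negative: string[] };

// Assumed semantics: sum of positive surfaces minus sum of negative surfaces,
// returning null when any referenced surface has no value for the period.
function evaluateSubtract(
  components: ComponentSurfaces,
  surfaceValues: Record<string, number | null>
): number | null {
  const keys = [...components.positive, ...components.negative];
  if (keys.some((key) => (surfaceValues[key] ?? null) === null)) {
    return null;
  }
  const total = (surfaceKeys: string[]) =>
    surfaceKeys.reduce((sum, key) => sum + (surfaceValues[key] as number), 0);
  return total(components.positive) - total(components.negative);
}

const grossProfitComponents: ComponentSurfaces = {
  positive: ['revenue'],
  negative: ['cost_of_revenue']
};

// Illustrative single-period values, not taken from a real filing.
const exampleGrossProfit = evaluateSubtract(grossProfitComponents, {
  revenue: 1_000,
  cost_of_revenue: 600
});
const exampleMissingCost = evaluateSubtract(grossProfitComponents, {
  revenue: 1_000,
  cost_of_revenue: null
});
```

Under these assumptions, moving the cost concepts into their own `cost_of_revenue` surface means the subtraction reuses whatever value that surface resolved to, rather than matching the concepts a second time.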

@@ -16,6 +16,64 @@
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "cost_of_revenue",
"statement": "income",
"label": "Cost of Revenue",
"category": "surface",
"order": 20,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:CostOfRevenue",
"us-gaap:CostOfGoodsSold",
"us-gaap:CostOfSales",
"us-gaap:CostOfGoodsAndServicesSold",
"us-gaap:CostOfGoodsAndServiceExcludingDepreciationDepletionAndAmortization",
"us-gaap:CostOfProductsSold",
"us-gaap:CostOfServices"
],
"allowed_authoritative_concepts": ["us-gaap:CostOfRevenue"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "gross_profit",
"statement": "income",
"label": "Gross Profit",
"category": "surface",
"order": 30,
"unit": "currency",
"rollup_policy": "direct_or_formula",
"allowed_source_concepts": ["us-gaap:GrossProfit"],
"allowed_authoritative_concepts": ["us-gaap:GrossProfit"],
"formula_fallback": {
"op": "subtract",
"sources": ["revenue", "cost_of_revenue"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "gross_margin",
"statement": "income",
"label": "Gross Margin",
"category": "derived",
"order": 35,
"unit": "percent",
"rollup_policy": "formula_only",
"allowed_source_concepts": [],
"allowed_authoritative_concepts": [],
"formula_fallback": {
"op": "divide",
"sources": ["gross_profit", "revenue"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "operating_expenses",
"statement": "income",
@@ -24,12 +82,257 @@
"order": 40,
"unit": "currency",
"rollup_policy": "aggregate_children",
"allowed_source_concepts": ["us-gaap:SellingGeneralAndAdministrativeExpense", "us-gaap:ResearchAndDevelopmentExpense"],
"allowed_source_concepts": ["us-gaap:OperatingExpenses"],
"allowed_authoritative_concepts": ["us-gaap:OperatingExpenses"],
"formula_fallback": "sum(detail_rows)",
"detail_grouping_policy": "group_all_children",
"materiality_policy": "income_default"
},
{
"surface_key": "selling_general_and_administrative",
"statement": "income",
"label": "Selling, General & Administrative",
"category": "surface",
"order": 45,
"unit": "currency",
"rollup_policy": "direct_or_formula",
"allowed_source_concepts": [
"us-gaap:SellingGeneralAndAdministrativeExpense",
"us-gaap:SellingGeneralAndAdministrativeExpenseExcludingEmployeeStockOptionPlanSpecialDividendCompensation"
],
"allowed_authoritative_concepts": ["us-gaap:SellingGeneralAndAdministrativeExpense"],
"formula_fallback": {
"op": "sum",
"sources": ["sales_and_marketing", "general_and_administrative"],
"treat_null_as_zero": true
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "research_and_development",
"statement": "income",
"label": "Research & Development",
"category": "surface",
"order": 50,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": ["us-gaap:ResearchAndDevelopmentExpense"],
"allowed_authoritative_concepts": ["us-gaap:ResearchAndDevelopmentExpense"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "depreciation_and_amortization",
"statement": "income",
"label": "Depreciation & Amortization",
"category": "surface",
"order": 55,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:DepreciationDepletionAndAmortization",
"us-gaap:DepreciationAmortizationAndAccretionNet",
"us-gaap:DepreciationAndAmortization",
"us-gaap:DepreciationAmortizationAndOther",
"us-gaap:CostOfGoodsAndServicesSoldDepreciationAndAmortization"
],
"allowed_authoritative_concepts": [
"us-gaap:DepreciationDepletionAndAmortization",
"us-gaap:DepreciationAmortizationAndAccretionNet",
"us-gaap:DepreciationAndAmortization",
"us-gaap:DepreciationAmortizationAndOther"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "stock_based_compensation",
"statement": "income",
"label": "Stock-Based Compensation",
"category": "surface",
"order": 58,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:ShareBasedCompensation",
"us-gaap:AllocatedShareBasedCompensationExpense"
],
"allowed_authoritative_concepts": [
"us-gaap:ShareBasedCompensation",
"us-gaap:AllocatedShareBasedCompensationExpense"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "operating_income",
"statement": "income",
"label": "Operating Income",
"category": "surface",
"order": 60,
"unit": "currency",
"rollup_policy": "direct_or_formula",
"allowed_source_concepts": [
"us-gaap:OperatingIncomeLoss",
"us-gaap:IncomeFromOperations",
"us-gaap:OperatingProfit"
],
"allowed_authoritative_concepts": ["us-gaap:OperatingIncomeLoss"],
"formula_fallback": {
"op": "subtract",
"sources": ["gross_profit", "operating_expenses"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "operating_margin",
"statement": "income",
"label": "Operating Margin",
"category": "derived",
"order": 65,
"unit": "percent",
"rollup_policy": "formula_only",
"allowed_source_concepts": [],
"allowed_authoritative_concepts": [],
"formula_fallback": {
"op": "divide",
"sources": ["operating_income", "revenue"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "interest_income",
"statement": "income",
"label": "Interest Income",
"category": "surface",
"order": 70,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:InterestIncomeOther",
"us-gaap:InvestmentIncomeInterest"
],
"allowed_authoritative_concepts": ["us-gaap:InterestIncomeOther"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "interest_expense",
"statement": "income",
"label": "Interest Expense",
"category": "surface",
"order": 75,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:InterestIncomeExpenseNonoperatingNet",
"us-gaap:InterestExpense",
"us-gaap:InterestAndDebtExpense"
],
"allowed_authoritative_concepts": ["us-gaap:InterestExpense"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default",
"sign_transform": "absolute"
},
{
"surface_key": "other_non_operating_income",
"statement": "income",
"label": "Other Non-Operating Income",
"category": "surface",
"order": 78,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:OtherNonoperatingIncomeExpense",
"us-gaap:NonoperatingIncomeExpense"
],
"allowed_authoritative_concepts": ["us-gaap:OtherNonoperatingIncomeExpense"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "pretax_income",
"statement": "income",
"label": "Pretax Income",
"category": "surface",
"order": 80,
"unit": "currency",
"rollup_policy": "direct_or_formula",
"allowed_source_concepts": [
"us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest",
"us-gaap:IncomeBeforeTaxExpenseBenefit",
"us-gaap:PretaxIncome"
],
"allowed_authoritative_concepts": ["us-gaap:IncomeBeforeTaxExpenseBenefit"],
"formula_fallback": {
"op": "sum",
"sources": ["net_income", "income_tax_expense"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "income_tax_expense",
"statement": "income",
"label": "Income Tax Expense",
"category": "surface",
"order": 85,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": ["us-gaap:IncomeTaxExpenseBenefit"],
"allowed_authoritative_concepts": ["us-gaap:IncomeTaxExpenseBenefit"],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "effective_tax_rate",
"statement": "income",
"label": "Effective Tax Rate",
"category": "derived",
"order": 87,
"unit": "percent",
"rollup_policy": "formula_only",
"allowed_source_concepts": [],
"allowed_authoritative_concepts": [],
"formula_fallback": {
"op": "divide",
"sources": ["income_tax_expense", "pretax_income"],
"treat_null_as_zero": false
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "ebitda",
"statement": "income",
"label": "EBITDA",
"category": "derived",
"order": 88,
"unit": "currency",
"rollup_policy": "formula_only",
"allowed_source_concepts": [],
"allowed_authoritative_concepts": [],
"formula_fallback": {
"op": "sum",
"sources": ["operating_income", "depreciation_and_amortization"],
"treat_null_as_zero": true
},
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "net_income",
"statement": "income",
@@ -44,6 +347,100 @@
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "net_income_attributable_to_common",
"statement": "income",
"label": "Net Income Attributable to Common",
"category": "surface",
"order": 92,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:NetIncomeLossAvailableToCommonStockholdersBasic"
],
"allowed_authoritative_concepts": [
"us-gaap:NetIncomeLossAvailableToCommonStockholdersBasic"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "diluted_eps",
"statement": "income",
"label": "Diluted EPS",
"category": "surface",
"order": 100,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:EarningsPerShareDiluted",
"us-gaap:DilutedEarningsPerShare"
],
"allowed_authoritative_concepts": [
"us-gaap:EarningsPerShareDiluted"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "basic_eps",
"statement": "income",
"label": "Basic EPS",
"category": "surface",
"order": 105,
"unit": "currency",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:EarningsPerShareBasic",
"us-gaap:BasicEarningsPerShare"
],
"allowed_authoritative_concepts": [
"us-gaap:EarningsPerShareBasic"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "diluted_shares",
"statement": "income",
"label": "Diluted Shares Outstanding",
"category": "surface",
"order": 110,
"unit": "shares",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding",
"us-gaap:WeightedAverageNumberOfShareOutstandingDiluted"
],
"allowed_authoritative_concepts": [
"us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "basic_shares",
"statement": "income",
"label": "Basic Shares Outstanding",
"category": "surface",
"order": 115,
"unit": "shares",
"rollup_policy": "direct_only",
"allowed_source_concepts": [
"us-gaap:WeightedAverageNumberOfSharesOutstandingBasic",
"us-gaap:WeightedAverageNumberOfShareOutstandingBasicAndDiluted"
],
"allowed_authoritative_concepts": [
"us-gaap:WeightedAverageNumberOfSharesOutstandingBasic"
],
"formula_fallback": null,
"detail_grouping_policy": "top_level_only",
"materiality_policy": "income_default"
},
{
"surface_key": "cash_and_equivalents",
"statement": "balance",