Implement dual-model filing pipeline with Ollama extraction

2026-02-28 16:31:25 -05:00
parent 0615534f4b
commit a09001501e
16 changed files with 872 additions and 51 deletions

@@ -14,7 +14,9 @@ Turbopack-first rebuild of a fiscal.ai-style terminal with Vercel AI SDK integra
- Eden Treaty for type-safe frontend API calls
- Workflow DevKit Local World for background task execution
- SQLite-backed domain storage (watchlist, holdings, filings, tasks, insights)
-- Vercel AI SDK (`ai`) + Zhipu community provider (`zhipu-ai-provider`) for analysis tasks (hardcoded to `https://api.z.ai/api/coding/paas/v4`)
+- Vercel AI SDK (`ai`) with dual-model routing:
+  - Ollama (`@ai-sdk/openai`) for lightweight filing extraction/parsing
+  - Zhipu (`zhipu-ai-provider`) for heavyweight narrative reports (`https://api.z.ai/api/coding/paas/v4`)
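The dual-model routing above can be sketched as a small model selector. This is a hypothetical helper, not the code from this commit; the `selectModel` name and the `env` parameter are illustrative, while the model names, env vars, and endpoint mirror the README defaults:

```typescript
type TaskKind = "extract" | "report";

interface ModelChoice {
  provider: "ollama" | "zhipu";
  model: string;
  baseURL: string;
}

// Route lightweight extraction to Ollama and heavy narrative reports to Zhipu.
function selectModel(
  task: TaskKind,
  env: Record<string, string | undefined> = {},
): ModelChoice {
  if (task === "extract") {
    return {
      provider: "ollama",
      model: env.OLLAMA_MODEL ?? "qwen3:8b",
      baseURL: env.OLLAMA_BASE_URL ?? "http://127.0.0.1:11434",
    };
  }
  return {
    provider: "zhipu",
    model: env.ZHIPU_MODEL ?? "glm-4.7-flashx",
    // Always the Coding API endpoint; ZHIPU_BASE_URL is ignored.
    baseURL: "https://api.z.ai/api/coding/paas/v4",
  };
}
```

The returned choice would then be handed to the corresponding AI SDK provider instance for the actual call.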
## Run locally
@@ -45,7 +47,9 @@ docker compose up --build -d
```
For local Docker, host port mapping comes from `docker-compose.override.yml` (default `http://localhost:3000` via `APP_PORT`).
-The app calls Zhipu directly via AI SDK and always targets the Coding API endpoint (`https://api.z.ai/api/coding/paas/v4`), so no extra AI gateway container is required.
+The app calls Zhipu directly via AI SDK for heavy reports and calls Ollama for lightweight filing extraction.
+When running in Docker and Ollama runs on the host, set `OLLAMA_BASE_URL=http://host.docker.internal:11434`.
+Zhipu always targets the Coding API endpoint (`https://api.z.ai/api/coding/paas/v4`).
On container startup, the app applies Drizzle migrations automatically before launching Next.js.
The app stores SQLite data in Docker volume `fiscal_sqlite_data` (mounted to `/app/data`) and workflow local data in `fiscal_workflow_data` (mounted to `/app/.workflow-data`).
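For the Docker-on-host-Ollama case, a minimal `docker-compose.override.yml` fragment might look like the following. This is a sketch, not taken from this repo: the `app` service name is an assumption, and the `extra_hosts` entry is only needed on Linux, where `host.docker.internal` is not defined by default:

```yaml
services:
  app:
    environment:
      # Reach the host's Ollama from inside the container
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
    extra_hosts:
      # Maps host.docker.internal to the host gateway (required on Linux)
      - "host.docker.internal:host-gateway"
```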
@@ -90,6 +94,10 @@ ZHIPU_API_KEY=
ZHIPU_MODEL=glm-4.7-flashx
# optional generation tuning
AI_TEMPERATURE=0.2
+OLLAMA_BASE_URL=http://127.0.0.1:11434
+OLLAMA_MODEL=qwen3:8b
+OLLAMA_API_KEY=ollama
SEC_USER_AGENT=Fiscal Clone <support@fiscal.local>
WORKFLOW_TARGET_WORLD=local
@@ -98,6 +106,7 @@ WORKFLOW_LOCAL_QUEUE_CONCURRENCY=100
```
If `ZHIPU_API_KEY` is unset, the app uses local fallback analysis so task workflows still run.
+If Ollama is unavailable, filing extraction falls back to deterministic metadata-based extraction and still proceeds to heavy report generation.
`ZHIPU_BASE_URL` is deprecated and ignored; runtime always uses `https://api.z.ai/api/coding/paas/v4`.
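The extraction fallback described above can be sketched as follows. All names here are hypothetical (`extractFiling`, `metadataExtraction`, and the `callOllama` stand-in for the real model call are not from this commit); the point is only the shape of the try/fall-back flow:

```typescript
interface FilingMeta {
  ticker: string;
  formType: string;
  filedAt: string;
}

interface Extraction {
  source: "ollama" | "metadata";
  summary: string;
}

// Deterministic fallback: derive a summary from filing metadata alone,
// so the pipeline never blocks on the model being reachable.
function metadataExtraction(meta: FilingMeta): Extraction {
  return {
    source: "metadata",
    summary: `${meta.ticker} ${meta.formType} filed ${meta.filedAt}`,
  };
}

// Try the model first; on any failure, fall back so the heavy
// report-generation step still receives a usable extraction.
async function extractFiling(
  meta: FilingMeta,
  callOllama: (meta: FilingMeta) => Promise<string>,
): Promise<Extraction> {
  try {
    return { source: "ollama", summary: await callOllama(meta) };
  } catch {
    return metadataExtraction(meta);
  }
}
```

Either branch yields the same `Extraction` shape, so downstream report generation does not need to know which path produced it.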
## API surface