Observability and Infrastructure
This page covers two things engineers reach for constantly: the analytical observability warehouse that every cross-deal read flows through, and the deployment substrate — the Railway services, the per-service Postgres pool budgets, and the Pinecone connection pool — that the whole system runs on.
The observability warehouse
Section titled “The observability warehouse”Memosa has its own self-hosted lakehouse for telemetry: a DuckLake warehouse (DuckDB 1.5.2 engine). The catalog lives on the existing Postgres; the data files live in a native Railway Bucket (S3-compatible — not AWS). For local dev and tests, a local file catalog stands in. This replaced an earlier Parquet-glob design, and the replacement is total: the old ObservabilityReader, the read_parquet glob, and union_by_name are deleted.
Two halves: producers write, consumers read
Section titled “Two halves: producers write, consumers read”The write path and the read path are deliberately separate.
Producers call get_observability_writer().append(table, row) from hot paths. append is synchronous, cheap, and NEVER raises — a production hot path can emit telemetry without any risk of breaking on a warehouse problem. The writer buffers rows (default flush at 100 rows or every 30 seconds ) and a single daemon flush thread serializes them into the catalog via insert_arrow. One flush thread is not an accident — serializing writes sidesteps a known multithreaded-Python DuckLake+Postgres deadlock.
Consumers read through ObservabilityLake — a process-wide singleton (get_observability_lake()), one lock-guarded, attach-once connection per process, with heal-on-error. There is also a read_only attach mode (used by the local telemetry MCP analyst surface) that attaches the catalog READ_ONLY and skips schema creation.
The ObservabilityLake singleton — why attach-once
Section titled “The ObservabilityLake singleton — why attach-once”The load-bearing design constraint is connection budget. DuckDB’s Postgres extension opens its own connection pool per attached database. The deleted glob reader created a fresh in-memory connection on every query() call (15+ call sites) — which would exhaust the per-service Postgres budgets the moment cross-deal reads got busy. The fix is a single shared attached connection per process, with the catalog’s Postgres pool capped at pg_pool_max_connections = 2 so each service process holds ~2 catalog connections, not dozens.
The lake heals on duckdb.Error: any error disposes the connection and the next call re-attaches (and re-runs CREATE TABLE IF NOT EXISTS for every registered table, so reads are safe before the first write). Tables are month(ts)-partitioned on first appearance, guarded so a partition spec doesn’t churn a fresh snapshot on every attach.
The registry — 15 tables, the source of truth
Section titled “The registry — 15 tables, the source of truth”TABLE_SCHEMAS in src/observability/schemas.py is the registry and the source of truth. It currently holds 15 tables — among them deal_finalizations, retrieval_events, agent_discourse_events, user_activity_events, llm_cost_events, readiness_snapshots, pinecone_upsert_events, formula_graph_events, sql_invariant_findings, conflict_resolution_events, excel_zero_chunks_attribution, semantic_critique_events, and checkpoint_size_events. The count drifts upward as tables are added; _attach iterates the registry dynamically, so a new table is created and partitioned on the next attach without a migration.
The consumer pattern
Section titled “The consumer pattern”Product-facing services that need cross-deal / analytical reads use the warehouse, not a fresh round of Postgres aggregation. AuditTrailService is the architectural precedent; about eleven services follow the same shape. The pattern is fixed:
- Stateless class, constructor
(reader: Optional[ObservabilityLake] = None)defaulting toget_observability_lake()— the process-wide singleton, never a per-request connect. - Parameter-bound SQL via
reader.query(sql, params). Table and column names are hardcoded; every external value is bound through a?placeholder. - A
_safe_querywrapper catches(duckdb.Error, Exception)and returns[], logging at DEBUG. Observability reads must NEVER break a route. - ISO-stringify timestamps before returning — DuckDB returns
datetimeobjects; FastAPI needs JSON-serializable output.
Most consumers are route-backed Canvas dashboards (PortfolioInsightsService, UserActivityInsightsService, AuditTrailService, …), but the pattern also runs on hot paths. NamespaceCountService reads SUM(chunks_uploaded) from pinecone_upsert_events as a ground-truth count during source-type recovery, and RetrievalHealthService aggregates retrieval_events per section on the worker’s live memo-generation path. Both are synchronous, shared-singleton, _safe_query-guarded reads — the live-path precedent for treating the warehouse as an active input, not just a dashboard backend.
Anonymity is enforced in Python, not SQL
Section titled “Anonymity is enforced in Python, not SQL”PortfolioInsightsService surfaces only aggregate statistics — medians, counts, ranges — never an individual deal’s name, ID, or raw metrics. The privacy floor is N ≥ 3 deals before any statistic surfaces, and that guard lives in the Python service layer, not in a SQL HAVING clause. Every aggregation query returns a single row, and the service maps a below-threshold count to None (if deal_count < _MIN_DEALS_FOR_AGGREGATE: return None, repeated across each aggregation method). _MIN_DEALS_FOR_AGGREGATE is 3 .
This is intentional on two counts: it keeps the SQL portable (no version-sensitive HAVING semantics), and it makes the privacy guard explicit and auditable in code rather than buried in a query. The universal base filter underneath every aggregation is finalization_status = 'success' — degraded finalizations are recorded but excluded, because their numeric fields (cap rate, LTV, …) are unreliable.
Schema evolution discipline
Section titled “Schema evolution discipline”Schema changes are native ALTER TABLE obs.<table> ADD COLUMN (run by ObservabilityLake). This is safe because old snapshots surface NULL for the new column, and DuckDB’s MEDIAN and AVG ignore NULL naturally — so a historical row still contributes to COUNT(*) but not to a median, which is the right semantic for aggregate stats. A contract test (tests/unit/observability/test_schema_additivity.py) locks this by doing an ADD COLUMN round-trip and asserting the prior row surfaces NULL.
What is forbidden: removing a column, retyping a column, or reusing a name for a different meaning. If you must change semantics, bump a new table name instead.
The Railway deployment substrate
Section titled “The Railway deployment substrate”Memosa runs on Railway across two environments — production (main branch, custom memosa.io domains) and staging (staging branch, Railway-generated URLs). Both run the same code; branch→environment mapping is via deployment triggers.
Application and infrastructure services
Section titled “Application and infrastructure services”Production runs four application services; staging runs a fifth, Evergreen (the nightly autonomous audit worker — production is intentionally asymmetric at four). Plus two Railway-managed infrastructure services.
| Dashboard name | railway.json key | Role |
|---|---|---|
| API | web | Backend HTTP API — health endpoint, image serving, REST endpoints for external callers |
| LangGraph | worker | Slack bot + LangGraph deal-analysis workflows (deep agents, RAG) — the heavy AI tier |
| Web App | canvas | Canvas document editor — FastAPI + ProseMirror, JWT/RBAC, brushes; runs WEB_CONCURRENCY=4 |
| Marketing Site | landing | Public landing page + waitlist — lightweight uvicorn, Resend email |
| Evergreen (staging only) | evergreen | Nightly P-series audit worker (python -m src.evergreen_worker.main); APScheduler fires midnight Pacific |
| Redis-Checkpointer | (infra) | LangGraph AsyncRedisSaver; Canvas pub/sub bridge, edit locks, presence, caches |
| Postgres | (infra) | Structured data via PGStoreClient; also hosts the DuckLake observability catalog |
There is also a daily DuckLake Maintenance cron service that runs scripts/ducklake_maintenance.py — flush-inlined / merge / expire / cleanup — to materialize inlined rows to Parquet and bound catalog growth.
Per-service Postgres pool budgets
Section titled “Per-service Postgres pool budgets”Railway terminates idle Postgres connections at roughly 60 seconds, and its usable connection ceiling is 97 across all services. Earlier, every service claimed a full ~94-connection budget independently, giving a worst case of 4 × 97 = 388 connections against a 97 ceiling. It only ever worked because min_size=0 (on-demand) plus admission control held the real peak far below capacity.
The current design gives each service a budgeted slice, resolved at startup from RAILWAY_SERVICE_NAME (Railway-provided service identity — immutable per service, so it is identity, not a config knob, and does not violate the no-env-var rule). The budgets are sized so asyncpg pools + DuckLake catalog connections stay under the ceiling:
| Service | asyncpg max_size | DuckLake catalog conns | Total |
|---|---|---|---|
worker (LangGraph) | 38 | 2 (single process) | 40 |
canvas (Web App) | 20 | 8 (4 workers × 2) | 28 |
web (API) | 15 | 0 | 15 |
landing (Marketing Site) | 5 | 0 | 5 |
evergreen (staging only) | 5 | 0 | 5 |
| DuckLake Maintenance cron | ~0 | 2 (transient daily) | 2 |
| Grand total | 95 (≤ 97) |
The canvas row is the subtle one. The DuckLake reader is a per-process singleton, and the Web App runs WEB_CONCURRENCY=4 . So Canvas holds 4 workers × 2 catalog connections = 8, not 2 — a single-worker assumption here undershot by 6 and pushed the real worst case to 101. The asyncpg budget was lowered to 20 (20 + 8 = 28 ≤ 30) to absorb it. worker is single-process (slack_bot.py), so its writer and NamespaceCountService share one lake — 2 catalog connections.
A second mechanism keeps per-process pools bounded under scale-up: _calculate_pool_max_size() divides the service budget by max(RAILWAY_REPLICA_COUNT, 1) × max(WEB_CONCURRENCY, 1), with a floor of 4 per process.
The Pinecone connection pool
Section titled “The Pinecone connection pool”Retrieval runs against Pinecone serverless through a connection pool sized in PineconeConfig (src/config/pinecone_config.py). The values are production-validated and the frozen dataclass refuses construction below their floors:
pool_size = 96. Raised through50 → 72 → 96, each bump a response to pool exhaustion during cascading low-chunk retrievals at 80–93% utilization. Do not lower below 96 without production log evidence;__post_init__enforces1 ≤ pool_size ≤ 100.retrieval_max_parallel = 16andingest_max_parallel = 16. P36 split the formerly-sharedmax_parallel_index_queries=24into separate retrieval and ingest semaphores so heavy upserts stop competing with analysis-phase reads on one semaphore. Their combined ceiling of 32 stays well under the admission threshold against a 96-slot pool;__post_init__enforces both≥ 16andretrieval + ingest ≤ pool_size.connection_timeout_secs = 10.0(must stay below the 30s query-timeout ceiling) andupsert_timeout_secs = 20.0(per attempt, 3 retries with exponential backoff).
The primary gate is not the pool size — it is the PoolAwareAdmissionController in src/vector_db/retrieval_executor.py, which checks real-time pool utilization before admitting a query and does graduated load-shedding. The hard-reject threshold is 88% utilization (admission_hard_reject_pct = 0.88 , source-diversity exempt) — lowered from 95% after a production cascade where the pool saturated 95→100% in a single surge; at 88% (84/96 active) the remaining 12 slots cover the source-diversity exemption and the recovery-priority bypass. Above 92% utilization, shielded recovery on cascade-cancelled sections is skipped, since spawning a recovery task during pool exhaustion only deepens starvation. The ordering of these thresholds is invariant-checked at construction.
How the pool feeds the retrieval stages and reranking is covered on the Retrieval Pipeline and Reranking pages.
Sources
Section titled “Sources”src/observability/ducklake.py—ObservabilityLake(attach-once singleton,pg_pool_max_connections=2, heal-on-error,read_onlymode,month(ts)partitioning,run_maintenance)src/observability/writer.py—ObservabilityWriter.append(sync, never-raises) + the single daemon flush thread;get_observability_writer()src/observability/schemas.py—TABLE_SCHEMASregistry (15 tables; nativeADD COLUMNdiscipline in the module docstring)src/observability/finalize_hook.py—record_deal_finalization()canonical write path (success + degraded)src/observability/namespace_count_service.py,src/observability/retrieval_health_service.py— live-path singleton consumers (_safe_query,_MIN_SAMPLE)src/canvas/services/portfolio_insights_service.py+portfolio_insights_queries.py— N≥3 anonymity guard in Python (_MIN_DEALS_FOR_AGGREGATE),finalization_status='success'filtersrc/utils/production_capacity_config.py— per-service Postgres budgets (POSTGRES_PER_SERVICE_MAX_SIZE,POSTGRES_DUCKLAKE_RESERVED,POSTGRES_RAILWAY_USABLE_CEILING,_RAILWAY_SERVICE_NAME_TO_KEY)src/config/pinecone_config.py—PineconeConfig(pool size, retrieval/ingest semaphores, admission thresholds,__post_init__invariants)railway.json— service definitions (Web App, Marketing Site, Evergreen, DuckLake Maintenance, …).claude/rules/10-domains.md,.claude/rules/20-patterns.md— Railway service table, the observability-warehouse consumer pattern, the per-service pool budgetsmemory/observability_ducklake_migration.md,memory/observability_warehouse_consumers.md,memory/postgres_pool_event_loop_invariants.md— DuckLake migration, the consumer census, and the pool event-loop invariants