Retrieval and Evidence

Every factual statement in a Memosa memo is backed by a citation that points to the exact place the fact came from. That discipline is the whole point — an investment memo an IC can trust is one where any number traces back to a document. This page explains how evidence gets into the memo, what the citation markers mean to you as a reader, why evidence is sometimes thin, and exactly what to do about it.

When Memosa writes a section, it does not summarize your documents wholesale. For each section it runs a focused retrieval against your deal’s indexed documents — pulling the specific chunks (a page of a PDF, a row of the Excel model, a CoStar table) that are most relevant to that section’s topic. Those retrieved chunks are the evidence the section is allowed to cite. If a fact is not in the retrieved evidence, it does not go in the memo.

This is why what you upload directly determines what the memo can say. The market section cannot cite rent comps that were never uploaded; the financial section cannot cite a debt schedule that is not in the model.

In the rendered memo you will see small footnote markers — for example [1], [2] — attached to claims, and a footnotes block at the end listing each source. Each marker links a specific claim to the specific source chunk it was drawn from.

During generation these citations start life as inline [SRC:n] markers — n is the index of a retrieved source chunk. A multi-stage footnote pass then normalizes them, resolves duplicates, renumbers them in document order, and renders the human-readable footnotes and “Sources Consulted” summary you see at the bottom of the memo.
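The renumbering step can be illustrated with a minimal sketch. It is not Memosa's actual footnote pipeline: the function name `render_footnotes`, the `sources` mapping, and the rule for collapsing adjacent duplicates are all assumptions made for illustration. It only shows how inline [SRC:n] markers could become footnote numbers assigned in document order.

```python
import re

# Matches the inline marker form described above, e.g. "[SRC:7]".
SRC_MARKER = re.compile(r"\[SRC:(\d+)\]")


def render_footnotes(text, sources):
    """Renumber [SRC:n] markers in order of first appearance.

    `sources` is a hypothetical mapping from a retrieved-chunk index n
    to a human-readable label. Returns (rewritten_text, footnote_lines).
    """
    order = {}  # chunk index -> footnote number, in first-seen order

    def replace(match):
        chunk = int(match.group(1))
        if chunk not in order:
            order[chunk] = len(order) + 1
        return f"[{order[chunk]}]"

    body = SRC_MARKER.sub(replace, text)
    # Assumed duplicate handling: collapse a marker repeated back to back,
    # e.g. "[2][2]" becomes "[2]".
    body = re.sub(r"(\[\d+\])(?:\s*\1)+", r"\1", body)
    footnotes = [f"[{num}] {sources[chunk]}" for chunk, num in order.items()]
    return body, footnotes


draft = "Cap rate is 5.2% [SRC:7]. NOI grows 3% [SRC:2][SRC:2]. Exit cap is 5.5% [SRC:7]."
labels = {7: "Excel model, Returns sheet", 2: "OM PDF, page 14"}
body, footnotes = render_footnotes(draft, labels)
```

Because numbers are assigned at first appearance, the same chunk cited twice keeps one footnote number, which is one simple way to read "resolves duplicates".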

For you, the practical meaning is simple:

Why a memo cites that source and not another

When two documents contain the same fact and disagree, Memosa does not average them or pick at random — it applies a fixed source precedence, highest wins:

Source               Precedence   What it is authoritative for
Your direct input    100          Corrections and figures you type
Excel model          90           Financial figures: pro-forma, returns, debt
CoStar export        70           Market, demographics, comparables
PDF (OM, etc.)       50           Narrative, sponsor, property description

So a cap rate that appears in both your OM PDF and your Excel model will be cited from the Excel model, because Excel outranks PDF. When sources conflict, the memo footnotes the resolution, and the Canvas Provenance panel shows the winning source and the candidates considered. See Cross-Deal Intelligence for the full precedence story.
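The precedence rule can be sketched in a few lines. The weights come from the table above; the `Candidate` class, the `resolve_conflict` function, and the sample values are hypothetical and stand in for whatever Memosa uses internally.

```python
from dataclasses import dataclass

# Weights as listed in the precedence table; the dictionary itself is a sketch.
PRECEDENCE = {"USER_INPUT": 100, "EXCEL": 90, "COSTAR": 70, "PDF": 50}


@dataclass(frozen=True)
class Candidate:
    """One source's version of a fact (hypothetical structure)."""
    source_type: str
    value: float
    location: str


def resolve_conflict(candidates):
    """Return (winner, ranked candidates), highest precedence first."""
    if not candidates:
        raise ValueError("no candidates to resolve")
    ranked = sorted(
        candidates, key=lambda c: PRECEDENCE[c.source_type], reverse=True
    )
    return ranked[0], ranked


# Illustrative values only: the same cap rate stated in two documents.
cap_rates = [
    Candidate("PDF", 0.0550, "OM, page 12"),
    Candidate("EXCEL", 0.0525, "Model, Returns sheet"),
]
winner, considered = resolve_conflict(cap_rates)
```

Returning the full ranked list alongside the winner mirrors what the Provenance panel is described as showing: the winning source plus the candidates it beat.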

Memosa knows which source types each section depends on. The pairing follows real underwriting practice:

  • Market, comparables, and competitive analysis depend on CoStar, which is where market, demographics, and comp data live. Excel financials legitimately do not answer a market query, so the system does not expect them there.
  • Financial analysis, key metrics, and financial risk depend on the Excel model: pro-forma, cash flow, returns, and debt schedule.
  • Sponsor background, property description, and most narrative risk come from the PDF (OM, sponsor materials, legal/regulatory documents).

When a section’s required source type is missing from what the primary retrieval returned, Memosa fires a targeted recovery query for exactly that type before writing the section — a second attempt to surface the missing evidence. Recovery is deliberately scoped: it only fires for a type the section genuinely requires, and it never tries to recover a PDF (those are handled differently) or anything you typed directly.
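The scoping rule for recovery can be sketched as a set difference. The section keys and the contents of `REQUIRED_TYPES` below are illustrative assumptions based on the pairing described above, not the real configuration, and `types_to_recover` is a hypothetical helper.

```python
# Hypothetical section -> required source types, following the pairing above.
REQUIRED_TYPES = {
    "market": {"COSTAR"},
    "comparables": {"COSTAR"},
    "financial_analysis": {"EXCEL"},
    "key_metrics": {"EXCEL"},
    "sponsor_background": {"PDF"},
}

# Types that are never the target of a recovery query, per the text above.
NEVER_RECOVERED = {"PDF", "USER_INPUT"}


def types_to_recover(section, retrieved_types):
    """Return the source types a targeted recovery query should be fired for.

    A type qualifies only if the section requires it, the primary retrieval
    did not return it, and it is not excluded from recovery.
    """
    required = REQUIRED_TYPES.get(section, set())
    missing = required - set(retrieved_types)
    return sorted(missing - NEVER_RECOVERED)


# Primary retrieval for the market section returned only PDF and Excel chunks.
market_recovery = types_to_recover("market", ["PDF", "EXCEL"])
```

Under these assumptions a market section with no CoStar chunks triggers one recovery query for CoStar, while a sponsor section with no PDF chunks triggers none.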

When evidence is thin or missing — and how to fix it

Thin evidence in a section comes down to a small number of causes. Work through them in order:

  1. You did not upload the source that section needs. The fix is direct: upload the Excel model for financial sections, CoStar exports for market and comps, the PDF for narrative. See Supported Documents.

  2. You uploaded it, but it was classified as the wrong type. This is the sneaky one. A CoStar report classified as a generic PDF lands in the PDF bucket, so the CoStar-dependent sections see zero CoStar evidence — indistinguishable from never uploading it unless you check your upload history. Before concluding “this deal had no market data,” confirm the file is actually present (Slack thread, web-chat file list, or the Canvas documents list). If it is there but the market section is empty, the file was almost certainly misclassified — re-run with the document clearly identifiable as a CoStar export.

  3. The model parsed differently than you read it. If the right source is cited but the figure is wrong, the question moves from retrieval to parsing — check the cell in your Excel model and, if it is correct there, report the discrepancy. See Document Processing.

A well-sourced memo has:

  • A footnote on essentially every factual claim, each tracing to a real chunk.
  • The expected source type cited in each section — CoStar in market and comps, Excel in financials, PDF in narrative.
  • Conflicts resolved transparently, with the winning source visible in the Provenance panel.

If a section is missing the source type it should have, that is the signal to act — almost always by fixing an upload, not by editing around the gap.

  • src/utils/consolidated_retrieval/recovery/source_type_recovery.py: SECTION_EXPECTED_TYPES (required vs optional source types per section), _NEVER_RECOVERED (pdf/user excluded), and the targeted recovery dispatch.
  • src/intel/ontology/cre_ontology.yaml: precedence_weights (USER_INPUT/EXCEL/COSTAR/PDF), which decide which source a conflicted fact is cited from.
  • src/langchain/workflows/tools/synthesis/content/word_count_utils.py and validation/citation_accuracy_probe.py: the [SRC:n] citation marker form.
  • Native memory: footnote_system.md (the multi-stage footnote pipeline that renders citations), costar_classification_silent_misclassify.md (misclassified CoStar reads as absence), feedback_web_chat_has_excel.md.