
What to Upload (Supported Documents)

Memosa turns a deal’s raw documents into a structured, cited investment memo. The quality of that memo depends directly on what you feed it. This page explains the supported inputs, how Memosa decides which number to trust when sources disagree, and how to get the cleanest extraction.

You can submit these documents through either intake path, Slack or web chat. Both paths accept the same three document types; they differ in upload mechanics and, as noted below the table, in one extra file extension on web chat.

| Type | Formats | What it is | What it contributes |
| --- | --- | --- | --- |
| PDF: Offering Memorandum | .pdf | The sponsor’s deal summary / OM | The deal narrative: business plan, location, sponsor background, story, charts and tables |
| Excel: Underwriting Model | .xlsx, .xls, .xlsm | The financial model | The hard numbers: rent roll, cash flow, OpEx, returns, sensitivities, the capital stack |
| CoStar exports | .pdf | CoStar market / comparables reports | Independent market context: comps, occupancy, submarket data |

The OM PDF and the Excel model are the two required inputs: Memosa will not generate a memo until it has both. CoStar exports are optional but strongly recommended, because they give the memo an independent, third-party read on the market that neither the OM nor the model provides.
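The "both required" rule can be pictured as a simple set check. The following is a minimal sketch only, not Memosa's actual code: the function names and the `REQUIRED_KINDS` constant are hypothetical, and the labels `pdf`, `excel`, and `costar` are borrowed from the upload classification described on this page.

```python
from typing import Iterable, Set

# Hypothetical constant: the two document kinds this page calls required.
REQUIRED_KINDS = {"pdf", "excel"}


def missing_required(kinds: Iterable[str]) -> Set[str]:
    """Return the required document kinds that have not been uploaded yet."""
    return REQUIRED_KINDS - set(kinds)


def ready_to_generate(kinds: Iterable[str]) -> bool:
    """True only when both the OM PDF and the Excel model are present."""
    return not missing_required(kinds)
```

Under these assumptions, a deal with only an OM and a CoStar export would report `{"excel"}` as missing, and generation would wait until the model arrives.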

A note on file-type limits by path: Slack intake classifies uploads as PDF, Excel, or CoStar. Web-chat upload additionally accepts .csv. In both cases the OM PDF and the Excel model are the backbone of the analysis.
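As an illustration of the web-chat limit, the sketch below checks a filename against the extension list cited for that path in the sources at the end of this page (.pdf, .xlsx, .xls, .csv). The constant and function names are hypothetical, and the real check may differ in how it is implemented.

```python
from pathlib import Path

# Extensions cited for the web-chat upload path; names here are illustrative.
WEB_CHAT_EXTENSIONS = {".pdf", ".xlsx", ".xls", ".csv"}


def accepted_on_web_chat(filename: str) -> bool:
    """Case-insensitive check of the filename's final extension."""
    return Path(filename).suffix.lower() in WEB_CHAT_EXTENSIONS
```

Lower-casing the suffix means `Model.XLSX` and `model.xlsx` are treated alike, which is a design choice of this sketch rather than a documented behavior.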

Real deals contain the same figure stated more than once — an NOI in the OM narrative, a different NOI in the model, an occupancy rate in CoStar that does not match the OM’s claim. Memosa does not guess. It applies a fixed data-source precedence ladder, trusting the more authoritative source when two disagree.

From highest authority to lowest:

USER INPUT (100) > EXCEL (90) > COSTAR (70) > PDF (50)

What this means in practice:

  • Your direct input wins over everything. Anything you tell Memosa explicitly (a corrected figure, a clarification) outranks every document.
  • The Excel model beats the OM PDF. Financials live in the model; the OM’s prose is a summary. If the model shows NOI of $2.4M but the OM states $2.1M, Memosa uses $2.4M.
  • CoStar beats the OM PDF for market data. If CoStar reports 95% occupancy but the OM claims 97%, Memosa uses 95%.
  • The OM PDF is the baseline, used where no higher-precedence source covers a fact.
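The ladder amounts to "pick the value whose source has the highest weight". Here is a minimal sketch of that rule using the weights listed above. The `Candidate` class and `resolve` function are hypothetical names for illustration; the actual weights are applied during retrieval, which this sketch does not attempt to model.

```python
from dataclasses import dataclass
from typing import List

# Weights copied from the precedence ladder.
PRECEDENCE = {"USER_INPUT": 100, "EXCEL": 90, "COSTAR": 70, "PDF": 50}


@dataclass(frozen=True)
class Candidate:
    """One stated value for a fact, tagged with the source it came from."""
    source: str   # must be a key of PRECEDENCE
    value: float


def resolve(candidates: List[Candidate]) -> Candidate:
    """Return the candidate from the highest-precedence source."""
    if not candidates:
        raise ValueError("no candidates to resolve")
    return max(candidates, key=lambda c: PRECEDENCE[c.source])


# The two examples from the list: Excel beats the OM, CoStar beats the OM.
noi = resolve([Candidate("PDF", 2_100_000), Candidate("EXCEL", 2_400_000)])
occupancy = resolve([Candidate("PDF", 0.97), Candidate("COSTAR", 0.95)])
```

With these inputs, `noi` carries the model's $2.4M and `occupancy` carries CoStar's 95%, matching the examples in the list.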

Crucially, Memosa does not hide the disagreement — it notes the discrepancy in the memo (for example, “NOI of $2.4M per underwriting model, vs. $2.1M cited in offering memo”). You get the trustworthy number and visibility into where the documents conflict, which is exactly the kind of thing an investment committee wants flagged.
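To make the shape of such a note concrete, the sketch below builds the example sentence from two conflicting values. It is purely illustrative: in Memosa the note is written as part of the memo's prose, and the helper names here are hypothetical.

```python
from typing import Optional


def format_millions(amount: float) -> str:
    """Render a dollar amount in millions with one decimal, e.g. '$2.4M'."""
    return f"${amount / 1_000_000:.1f}M"


def discrepancy_note(metric: str, used: float, used_label: str,
                     other: float, other_label: str) -> Optional[str]:
    """Describe a conflict between two sources, or return None if they agree."""
    if used == other:
        return None
    return (f"{metric} of {format_millions(used)} per {used_label}, "
            f"vs. {format_millions(other)} cited in {other_label}")


note = discrepancy_note("NOI", 2_400_000, "underwriting model",
                        2_100_000, "offering memo")
```

Returning `None` when the values agree keeps the memo free of notes where there is nothing to flag.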

For how these figures are extracted under the hood — the Excel parser, the PDF page pipeline, CoStar classification, and how the precedence weights are applied during retrieval — see the architecture deep dive on document processing.

At a high level, every uploaded document is processed and indexed into a deal-specific, isolated namespace, then retrieved on demand as the memo’s sections are written. Each claim in the finished memo carries a citation back to the source it came from, so reviewers can trace any number to its origin. The retrieval and citation machinery is covered in the retrieval pipeline and synthesis & footnotes.

Always include both required files. The OM gives the story; the model gives the numbers. With only one, the memo is missing half its evidence.

Add CoStar when you have it. Independent market data materially strengthens the comparables and market-overview sections and gives the memo a source that is neither the sponsor’s narrative nor the sponsor’s model.

Upload a clean, complete underwriting model. Because Excel outranks the OM, the model's figures are the ones the memo uses wherever the two overlap. A complete workbook (rent roll, cash flow, OpEx, returns, sensitivities) produces a far richer financial analysis than a stripped-down one.

Use a text-based PDF where possible. A born-digital OM (selectable text) extracts more cleanly than a scanned image. Memosa handles image-heavy PDFs, but clean text is faster and more accurate.

Name the deal the way you want it to read. The deal name keys the namespace and appears on the memo. Set it deliberately rather than accepting a placeholder.

Correct figures directly if you spot an error. Your direct input is the single highest-precedence source — if you know a document figure is stale, telling Memosa overrides it everywhere.

  • CLAUDE.md — the canonical Data Source Precedence ladder (USER INPUT (100) > EXCEL (90) > COSTAR (70) > PDF (50)) and the RAG embedding / chunking facts.
  • src/intel/ontology/cre_ontology.yaml: the precedence_weights block (USER_INPUT 100, EXCEL 90, COSTAR 70, PDF 50), which is the single source of truth for the precedence weights.
  • src/intel/ontology/prose_generator.py — the conflict-resolution guidance the system applies (Excel-over-PDF and CoStar-over-PDF examples; “note the discrepancy” behavior).
  • src/utils/file_classifier.py — classification of uploads into pdf / excel / costar, including content-based CoStar detection of PDFs that don’t say “CoStar”.
  • src/canvas/services/web_intake_config.py — accepted extensions on the web-chat upload path (.pdf, .xlsx, .xls, .csv).
  • src/slack/handlers/file_handler.py — supported intake types on the Slack path (pdf, excel, costar).