What Happens After You Upload a Document
Pipeline Overview
After upload, every document goes through an automatic processing pipeline. You can track progress via status badges on the document:
uploaded -> validating -> validated -> processing -> processed
If a workspace automation applies to the document, the workflow continues after processing with automation-related statuses.
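The status flow above can be read as a small state machine. The following is a minimal sketch, not Notoria's implementation: the status names come from this article, while the transition table and the helper names (`ALLOWED_TRANSITIONS`, `can_transition`, `is_failure`) are assumptions made for illustration. Automation-related statuses are left out because this article does not list them in full.

```python
# Hypothetical sketch of the badge flow as a transition table.
# Status names are from the article; how they connect is an assumption.
ALLOWED_TRANSITIONS = {
    "uploaded": {"validating"},
    "validating": {"validated", "failed_validation"},
    "validated": {"processing"},
    "processing": {"processed", "failed_processing"},
    "processed": set(),  # automation statuses would follow here
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the sketch allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def is_failure(status: str) -> bool:
    """Failure badges in the article all share the 'failed_' prefix."""
    return status.startswith("failed_")
```

A client that polls a document's badge could use `is_failure` to decide when to stop waiting and show an error.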
Step 1: Validation
Notoria runs three checks on every uploaded file:
- MIME type check - verifies the file matches its claimed format.
- Integrity check - confirms the file is not corrupted.
- Antivirus scan - scans for malware.
The badge shows "validating" during this step. If any check fails, the badge changes to "failed_validation".
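To illustrate what a MIME type check can look like, here is a minimal sketch that compares a file's leading bytes with the signature expected for its claimed type. This article does not describe how Notoria performs the check; the table `MAGIC_PREFIXES` and the functions `matches_claimed_type` and `validation_status` are hypothetical, and the sketch covers only three common formats.

```python
# Well-known file signatures ("magic bytes") for a few formats.
MAGIC_PREFIXES = {
    "application/pdf": b"%PDF-",
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
}


def matches_claimed_type(data: bytes, claimed_mime: str) -> bool:
    """Check whether the file content starts with the expected signature."""
    prefix = MAGIC_PREFIXES.get(claimed_mime)
    if prefix is None:
        # Assumption: this sketch rejects types it does not know about.
        return False
    return data.startswith(prefix)


def validation_status(data: bytes, claimed_mime: str) -> str:
    """Map the check result to the badge names used in the article."""
    if matches_claimed_type(data, claimed_mime):
        return "validated"
    return "failed_validation"
```

For example, a PNG image uploaded with a `.pdf` extension would fail this check, because its content does not start with `%PDF-`.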
Step 2: File Preparation
Before AI analysis starts, Notoria prepares the file for downstream steps:
- corrects image orientation when needed
- generates image versions for previews and document viewing
- extracts raw text from formats that already contain text, including PDFs and supported Office files
Step 3: Transcript and Description
Two core AI steps then run:
- Transcript generation - produces searchable text for handwritten or image-heavy files.
- Description generation - creates a short summary of the document.
Step 4: Version Detection
After Notoria has enough content to compare documents, it checks whether the upload looks like a newer version of something already in the workspace.
When Notoria is confident of a match, it links the file into the same document series and uses the earlier version as context for the next AI steps.
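The idea of linking only when confident can be sketched with a similarity score and a threshold. This is an illustration under stated assumptions, not Notoria's method: it compares extracted text with Python's standard `difflib.SequenceMatcher`, and the names `CONFIDENCE_THRESHOLD` and `find_previous_version`, as well as the value 0.8, are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Assumed cut-off: below this score the upload is treated as a new document.
CONFIDENCE_THRESHOLD = 0.8


def find_previous_version(new_text: str, existing: Dict[str, str]) -> Optional[str]:
    """Return the id of the most similar existing document, or None.

    `existing` maps a document id to its extracted text.
    """
    best_id, best_score = None, 0.0
    for doc_id, text in existing.items():
        score = SequenceMatcher(None, new_text, text).ratio()
        if score > best_score:
            best_id, best_score = doc_id, score
    # Link only when the best match clears the confidence threshold.
    return best_id if best_score >= CONFIDENCE_THRESHOLD else None
```

Returning `None` for weak matches mirrors the behavior described above: a document is linked into a series only when the match is strong enough.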
Step 5: AI Enrichment
Several enrichment steps run after the transcript and description are ready:
- Name suggestion - proposes a meaningful document name based on the content.
- Tag suggestion - suggests relevant tags from your existing workspace tags.
- Document type classification - suggests the best matching document type when your workspace has document types configured.
- Memory extraction - identifies key facts and dates to store as Memories.
- Embedding generation - creates a vector representation for semantic search.
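To show how a vector representation supports semantic search, the sketch below ranks documents by cosine similarity between a query vector and each document vector. The three-dimensional vectors are toy values; real embeddings come from a model and have many more dimensions. The function names are hypothetical, and this article does not state which similarity measure Notoria uses.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(query: List[float], docs: Dict[str, List[float]]) -> List[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

Because cosine similarity ignores vector length, two documents about the same topic can score as close even when one is much longer than the other.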
Step 6: Automations
Once processing finishes, Notoria can hand the document off to a workspace automation.
Depending on how your workspace is configured, automations can:
- classify the document type
- extract structured field values
- validate document content
- call webhooks
- route the document into the right folder
For more on this layer, see Using Automations.
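As a rough illustration of the folder-routing action listed above, the sketch below maps a document type to a destination folder and falls back to a default. All of it is hypothetical: the type names, the folder paths, and the names `ROUTING_RULES`, `DEFAULT_FOLDER`, and `route_document` are invented for the example and do not reflect how automations are configured in Notoria.

```python
from typing import Dict

# Hypothetical routing rules: document type -> folder path.
ROUTING_RULES: Dict[str, str] = {
    "invoice": "Finance/Invoices",
    "contract": "Legal/Contracts",
}

# Assumed fallback for document types without a rule.
DEFAULT_FOLDER = "Inbox"


def route_document(document_type: str) -> str:
    """Pick a folder for a document type, ignoring case and outer spaces."""
    return ROUTING_RULES.get(document_type.strip().lower(), DEFAULT_FOLDER)
```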
Reviewing Your Document
Once processing completes, open the document to review the results. You can edit the transcript, description, name, document type, and any custom field values directly. You can also accept or remove suggested tags.
When Processing Fails
- "failed_validation" - the file did not match its claimed format, is corrupted, or was flagged by the antivirus scan. Re-upload a clean copy of the file.
- "failed_processing" - an AI processing step encountered an error. You can retry processing from the document detail page.
- "failed_automation" - the document processed successfully, but a follow-up automation step failed.
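The failure badges above can be handled with a simple lookup. This is a minimal sketch: the status names and the guidance text are taken from this article, while the names `FAILURE_GUIDANCE` and `guidance_for` are hypothetical.

```python
from typing import Optional

# Guidance text paraphrased from the list above.
FAILURE_GUIDANCE = {
    "failed_validation": "Re-upload a clean copy of the file.",
    "failed_processing": "Retry processing from the document detail page.",
    "failed_automation": "Processing succeeded, but a follow-up automation step failed.",
}


def guidance_for(status: str) -> Optional[str]:
    """Return guidance for a failure badge, or None for any other status."""
    return FAILURE_GUIDANCE.get(status)
```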
FAQ
Does Notoria process Office files too?
Yes. Notoria supports several Office formats and extracts their text before the AI enrichment steps run.
Can I edit the transcript after processing?
Yes. Open the document and edit the transcript directly. Changes are saved automatically.
What happens if Notoria detects a previous version?
When Notoria is confident of the match, it links the document to the earlier file in the same series so that later AI steps can use the earlier version as context.