Document Processing

When you upload a document to LOQI, it goes through a multi-step processing pipeline to make it searchable and citable by the AI.

Processing stages

Stage       What happens
Reading     Text is extracted from the file (PDF, Word, Excel, etc.)
Organizing  The text is split into meaningful chunks (paragraphs, sections)
Indexing    Each chunk is converted to a vector embedding for semantic search
Complete    The document is ready for AI use

A progress indicator shows the current stage and the percentage completed.
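
To make the Organizing and Indexing stages more concrete, here is a minimal sketch of chunking followed by similarity search. It is an illustration only, not LOQI's implementation: the function names are hypothetical, chunks are split on blank lines, and a simple bag-of-words count stands in for a real vector embedding model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str) -> list[str]:
    """Organizing (assumed): split on blank lines, one chunk per paragraph."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


def embed(chunk: str) -> Counter:
    """Indexing (toy stand-in): a word-count vector instead of a model embedding."""
    return Counter(re.findall(r"[a-z0-9]+", chunk.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index: list[tuple[str, Counter]], query: str) -> str:
    """Return the chunk whose vector is most similar to the query's vector."""
    q = embed(query)
    return max(index, key=lambda item: cosine(item[1], q))[0]


document = (
    "Invoices are due within 30 days.\n\n"
    "Refunds are issued to the original payment method."
)
chunks = split_into_chunks(document)
index = [(c, embed(c)) for c in chunks]
best = search(index, "When are invoices due?")
print(best)
```

A production system would replace `embed` with a learned embedding model, which can match on meaning rather than shared words, but the overall flow of chunk, embed, and compare stays the same.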

Supported file types

Type    Extensions   Notes
PDF     .pdf         Includes scanned PDFs (processed via vision extraction)
Word    .docx        Full text and formatting preserved
Excel   .xlsx        Sheet content extracted as text
CSV     .csv         Tabular data extracted
Images  .png, .jpg   Text extracted via vision AI
Text    .txt         Direct text ingestion
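
If you prepare uploads with a script, a pre-upload check against the extensions listed above might look like the sketch below. The helper name is hypothetical, and the case-insensitive match is an assumption; extensions not listed above (such as .jpeg) are treated as unsupported here simply because the table does not mention them.

```python
from pathlib import Path

# Extensions taken from the supported file types table above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".csv", ".png", ".jpg", ".txt"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the supported list.

    Assumption: matching is case-insensitive, so "REPORT.PDF" is accepted.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["contract.pdf", "REPORT.PDF", "notes.md"]:
    print(name, is_supported(name))
```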

Processing time

Processing time depends on:

  • File size — Larger files take longer
  • File type — Scanned PDFs (image-based) take longer than text-based PDFs
  • Number of pages — A 10-page PDF processes in under a minute; a 200-page document may take several minutes

Scanned documents

For scanned PDFs (where text cannot be extracted directly):

  1. LOQI uses vision AI to read each page
  2. Text is extracted from the images
  3. The document then goes through the normal chunking and indexing pipeline

This takes longer but ensures even scanned documents are fully searchable.
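
The steps above can be sketched as a simple fallback: use the page's text layer when one exists, and otherwise hand the page image to a vision step. This is a sketch under assumptions, not LOQI's actual code; `read_page`, `read_document`, and `fake_vision_extract` are hypothetical names, the per-page fallback is an assumption, and the vision step is a placeholder that does no real image recognition.

```python
from typing import Callable, Iterable


def read_page(direct_text: str, page_image: bytes,
              vision_extract: Callable[[bytes], str]) -> str:
    """Return the page's text, calling the vision step only when no text layer exists."""
    if direct_text.strip():
        return direct_text
    return vision_extract(page_image)


def read_document(pages: Iterable[tuple[str, bytes]],
                  vision_extract: Callable[[bytes], str]) -> str:
    """Join the text of all pages; the result then goes on to chunking and indexing."""
    return "\n\n".join(read_page(text, image, vision_extract) for text, image in pages)


def fake_vision_extract(image: bytes) -> str:
    # Placeholder for a vision model: pretend the image bytes are the page text.
    return image.decode("utf-8")


pages = [
    ("Page one has a text layer.", b""),   # text-based page
    ("", b"Page two was scanned."),        # scanned page, no text layer
]
full_text = read_document(pages, fake_vision_extract)
print(full_text)
```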

Troubleshooting

If a document shows a failed status:

  • Check that the file is not corrupted or password-protected
  • Try re-uploading the file
  • Very large files (over 150 MB) may time out; try splitting them

File size limits

The maximum upload size is 150 MB per file. For larger documents, split them into smaller parts before uploading.
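
As a rough illustration, the limit can be checked before uploading, along with an estimate of how many parts a larger file needs. The names below are hypothetical, and the sketch assumes 1 MB means 1,048,576 bytes, which the limit above does not specify.

```python
# Assumption: 1 MB = 1024 * 1024 bytes; the documented limit does not say which unit applies.
MAX_UPLOAD_BYTES = 150 * 1024 * 1024


def fits_upload_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is within the 150 MB limit."""
    return size_bytes <= MAX_UPLOAD_BYTES


def parts_needed(size_bytes: int) -> int:
    """Smallest number of parts so that each part fits within the limit."""
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-size_bytes // MAX_UPLOAD_BYTES))


size = 400 * 1024 * 1024  # example: a 400 MB file
print(fits_upload_limit(size), parts_needed(size))
```

For a file on disk, `os.path.getsize(path)` returns the size in bytes that these functions expect.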