File Processing and Page Balance

Legacy Beta Plan

This page applies only to the legacy beta plan, which uses cloud-based file processing.

Beaver uploads and processes your PDF attachments to enable full-document search. During the beta period, this processing is free for up to 125,000 pages (approximately 4,000 articles).

Processing Limits

Limit                       Value
Maximum file size           50 MB
Maximum pages per file      500 pages
Free page balance (beta)    125,000 pages
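
The per-file limits listed above can be expressed as a simple eligibility check. The following is a minimal sketch, not Beaver's actual implementation: the `PdfInfo` record and `limit_violations` function are hypothetical names, and it assumes that 1 MB means 1024 × 1024 bytes and that a file exactly at a limit is still accepted.

```python
from dataclasses import dataclass

# Limits taken from this page.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_PAGES_PER_FILE = 500


@dataclass
class PdfInfo:
    """Hypothetical record describing one PDF attachment."""
    name: str
    size_bytes: int
    page_count: int


def limit_violations(pdf: PdfInfo) -> list[str]:
    """Return the limits this file exceeds (an empty list means eligible)."""
    problems = []
    # Assumption: a file exactly at a limit is still accepted.
    if pdf.size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file size exceeds 50 MB")
    if pdf.page_count > MAX_PAGES_PER_FILE:
        problems.append("page count exceeds 500 pages")
    return problems
```

Returning a list of reasons rather than a single boolean makes it possible to tell the user which limit a skipped file ran into.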

How Page Balance Works

  1. Files are queued by modification date – More recently modified files are processed first
  2. Pages are deducted upon processing – Each successfully processed file deducts its page count from your balance
  3. Failed processing is refunded – If processing fails, the page credits are returned
  4. Zero balance stops new processing – When your balance reaches 0, additional files wait in queue
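
The four rules above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Beaver's actual code: `QueuedFile`, `run_queue`, and the `will_fail` flag are hypothetical, and the behavior for a file whose page count is larger than the remaining balance is not described on this page, so the sketch assumes such a file waits in the queue.

```python
from dataclasses import dataclass


@dataclass
class QueuedFile:
    """Hypothetical record for one PDF waiting to be processed."""
    name: str
    modified: float          # modification time, e.g. a Unix timestamp
    pages: int
    will_fail: bool = False  # stand-in for a processing error


def run_queue(files, balance):
    """Simulate the rules; returns (processed, failed, waiting, balance)."""
    # Rule 1: most recently modified files are handled first.
    queue = sorted(files, key=lambda f: f.modified, reverse=True)
    processed, failed, waiting = [], [], []
    for f in queue:
        # Rule 4: with no balance left, files stay in the queue.
        # Assumption: a file larger than the remaining balance also waits.
        if balance <= 0 or f.pages > balance:
            waiting.append(f.name)
            continue
        balance -= f.pages      # Rule 2: deduct the file's page count.
        if f.will_fail:
            balance += f.pages  # Rule 3: refund the pages on failure.
            failed.append(f.name)
            continue
        processed.append(f.name)
    return processed, failed, waiting, balance


files = [
    QueuedFile("old.pdf", modified=100, pages=40),
    QueuedFile("new.pdf", modified=300, pages=30),
    QueuedFile("broken.pdf", modified=200, pages=20, will_fail=True),
]
result = run_queue(files, balance=50)
# result == (["new.pdf"], ["broken.pdf"], ["old.pdf"], 20)
```

In this run, the newest file is processed and costs 30 pages, the failed file is refunded, and the oldest file waits because the remaining 20 pages do not cover it.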

You can view your current page balance in Beaver's settings.

What Happens When Balance Is Exhausted

  • Files already processed remain fully searchable
  • New files will not be processed until balance is available
  • You can still use metadata search and search within already-processed documents
  • Deleting files does not restore balance (processing costs are incurred at processing time)

Managing Large Libraries

If your library exceeds 125,000 pages:

  • Select specific libraries: In settings, choose which Zotero libraries to sync with Beaver
  • Prioritize recent work: More recently modified files are processed first, so your active research becomes searchable before older material
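
To judge whether a given selection of libraries fits within the free balance, you can compare the combined page count against 125,000. The sketch below assumes you already know (or have estimated) the page total of each library; the function name and the example page counts are hypothetical, and the check ignores files that would be skipped for exceeding the per-file limits.

```python
FREE_PAGE_BALANCE = 125_000  # free beta balance stated on this page


def pages_over_balance(library_pages, selected):
    """Return how many pages the selected libraries exceed the balance by.

    library_pages: dict mapping a library name to its total PDF page count
    selected: names of the libraries chosen for syncing
    A result of 0 means the selection fits within the free balance.
    """
    total = sum(library_pages[name] for name in selected)
    return max(0, total - FREE_PAGE_BALANCE)


# Hypothetical page counts, for illustration only.
libraries = {"My Library": 90_000, "Lab Group": 60_000}
both = pages_over_balance(libraries, ["My Library", "Lab Group"])  # 25_000
one = pages_over_balance(libraries, ["My Library"])                # 0
```

Here syncing both libraries would exceed the balance by 25,000 pages, while syncing only the first fits.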

Under the Hood

PDF processing involves converting documents to structured text with sentence-level indexing, generating embeddings for semantic search, and indexing for keyword search. This work is computationally intensive, which is why page limits exist. The processing pipeline is designed for academic articles, reports, and books. Other document types, such as slideshows and table-only reports, will still be processed, but the pipeline is not optimized for them.