File Processing and Page Balance
Legacy Beta Plan
This page applies only to the legacy beta plan, which uses cloud-based file processing.
Beaver uploads and processes your PDF attachments to enable full-document search. During the beta period, this processing is free for up to 125,000 pages (approximately 4,000 articles).
Processing Limits
| Limit | Value |
|---|---|
| Maximum file size | 50 MB |
| Maximum pages per file | 500 pages |
| Free page balance (beta) | 125,000 pages |
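The limits in the table can be expressed as a simple pre-check. The following is a minimal sketch, not Beaver's actual code: the function name `check_limits` is hypothetical, and it assumes that "50 MB" means 50 × 1024 × 1024 bytes and that a file exactly at a limit is still accepted.

```python
# Illustrative constants taken from the limits table above.
# Assumption: "50 MB" is interpreted as 50 MiB; the real cutoff may differ.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024
MAX_PAGES_PER_FILE = 500


def check_limits(size_bytes: int, page_count: int) -> list[str]:
    """Return a list of reasons a file would exceed the limits (empty if none).

    Hypothetical helper for illustration only.
    """
    problems = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 50 MB")
    if page_count > MAX_PAGES_PER_FILE:
        problems.append("file has more than 500 pages")
    return problems


if __name__ == "__main__":
    # A 12 MB, 30-page article passes; a 60 MB, 600-page book fails both checks.
    print(check_limits(12 * 1024 * 1024, 30))
    print(check_limits(60 * 1024 * 1024, 600))
```

Returning a list of reasons rather than a single boolean makes it possible to report every limit a file exceeds at once.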
How Page Balance Works
- Files are queued by modification date – More recently modified files are processed first
- Pages are deducted upon processing – Each successfully processed file deducts its page count from your balance
- Failed processing is refunded – If processing fails, the page credits are returned
- Zero balance stops new processing – When your balance reaches 0, additional files wait in the queue
You can view your current page balance in Beaver's settings.
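The rules above can be modeled with a short simulation. This is a sketch under stated assumptions, not a description of Beaver's implementation: the `QueuedFile` class, its `fails` flag, and `simulate_queue` are hypothetical names. It also assumes that a file whose page count exceeds the remaining balance waits along with everything behind it; the documentation only states that processing stops when the balance reaches 0.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QueuedFile:
    name: str
    modified: datetime
    pages: int
    fails: bool = False  # hypothetical flag used to simulate a processing failure


def simulate_queue(files, balance):
    """Simulate page-balance accounting for a queue of files.

    Returns (processed, failed, waiting, remaining_balance).
    """
    processed, failed, waiting = [], [], []
    # Most recently modified files are handled first.
    queue = sorted(files, key=lambda f: f.modified, reverse=True)
    for i, f in enumerate(queue):
        # Assumption: a file that does not fit in the balance blocks the queue.
        if balance <= 0 or f.pages > balance:
            waiting = [g.name for g in queue[i:]]
            break
        balance -= f.pages  # pages are deducted for processing
        if f.fails:
            balance += f.pages  # failed processing is refunded
            failed.append(f.name)
        else:
            processed.append(f.name)
    return processed, failed, waiting, balance


if __name__ == "__main__":
    files = [
        QueuedFile("old.pdf", datetime(2020, 1, 1), 40),
        QueuedFile("new.pdf", datetime(2024, 1, 1), 30),
        QueuedFile("broken.pdf", datetime(2023, 1, 1), 20, fails=True),
    ]
    print(simulate_queue(files, balance=50))
```

In this example, `new.pdf` is processed first and costs 30 pages, `broken.pdf` fails and its 20 pages are returned, and `old.pdf` waits because the remaining 20 pages do not cover its 40.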
What Happens When Balance Is Exhausted
- Files already processed remain fully searchable
- New files will not be processed until balance is available
- You can still use metadata search and search within already-processed documents
- Deleting files does not restore balance (processing costs are incurred at processing time)
Managing Large Libraries
If your library exceeds 125,000 pages:
- Select specific libraries: In settings, choose which Zotero libraries to sync with Beaver
- Prioritize recent work: More recently modified files are processed first, so your active research becomes searchable before older material
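To decide which libraries to select, it can help to estimate how many pages a given selection would consume. The sketch below is illustrative only: the function `pages_over_balance` and the library names and page counts are hypothetical, and it ignores the per-file size and page limits.

```python
FREE_PAGE_BALANCE = 125_000  # free page balance during the beta


def pages_over_balance(pages_by_library, selected, balance=FREE_PAGE_BALANCE):
    """Return how many pages the selected libraries exceed the balance by (0 if they fit).

    Hypothetical helper; page counts per library are assumed to be known.
    """
    total = sum(
        pages for name, pages in pages_by_library.items() if name in selected
    )
    return max(0, total - balance)


if __name__ == "__main__":
    # Made-up page counts for three libraries.
    libraries = {"My Library": 90_000, "Lab Group": 60_000, "Archive": 20_000}
    print(pages_over_balance(libraries, {"My Library", "Lab Group"}))
    print(pages_over_balance(libraries, {"My Library", "Archive"}))
```

With these made-up numbers, selecting "My Library" and "Lab Group" would exceed the balance by 25,000 pages, while "My Library" and "Archive" together would fit.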
Under the Hood
PDF processing involves converting documents to structured text with sentence-level indexing, generating embeddings for semantic search, and building an index for keyword search. This work is computationally expensive, which is why page limits exist.
The processing pipeline is designed for academic articles, reports, and books. Other document types, such as slideshows or table-only reports, will still be processed, but the pipeline is not optimized for them and results may be poorer.